ISISComputingGroup / IBEX

Top level repository for IBEX stories

ZOOM: Employ the collision detection system to the baffle and detector trolley axes #2936

Closed: KathrynBaker closed this issue 6 years ago

KathrynBaker commented 6 years ago

As someone concerned with the motion on ZOOM I want to be able to automate the soft limits on the detector and baffle axes to avoid triggering the bump strips in standard operation of the system.

Acceptance Criteria

  1. The soft limits on the two axes are updated as appropriate when each changes.
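A minimal sketch of what this criterion could look like in code, assuming the two axes travel along a shared line and must keep a fixed clearance between them. The `Axis` class, the `update_soft_limits` function, and the `clearance` parameter are hypothetical names for illustration only; they are not taken from the collision detection code, and the real geometry on ZOOM may differ.

```python
from dataclasses import dataclass


@dataclass
class Axis:
    """Hypothetical stand-in for a motor axis with soft limits."""
    name: str
    position: float
    low_limit: float
    high_limit: float


def update_soft_limits(lower: Axis, upper: Axis, clearance: float) -> None:
    """Recompute the facing soft limits of two axes on a shared line.

    Assumes `lower` always sits at a smaller coordinate than `upper`.
    Each axis may travel towards the other only until `clearance`
    separates it from the other axis' current position.
    """
    if clearance < 0:
        raise ValueError("clearance must be non-negative")
    lower.high_limit = upper.position - clearance
    upper.low_limit = lower.position + clearance


# Illustrative values only, not measured ZOOM distances.
axis_a = Axis("axis_a", position=100.0, low_limit=0.0, high_limit=1000.0)
axis_b = Axis("axis_b", position=500.0, low_limit=0.0, high_limit=1000.0)
update_soft_limits(axis_a, axis_b, clearance=50.0)
```

In this sketch the function would be called again whenever either axis reports a new position, so that each axis' limit tracks the other.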

kjwoodsISIS commented 6 years ago

Contact @jonelmer if we need specific details.

DominicOram commented 6 years ago

From a code chat, the following needs to be done as part of this ticket to make the collision detection software into an MVP, in order of priority:

The following would be nice to do as part of the MVP:

The following are features that would be useful for further development:

This ticket will be timeboxed to 3 days. Create new ticket(s) for any of the above that are not completed.

AdrianPotter commented 6 years ago

Putting back to ready until I've finished some other tasks in case someone has time before me

Tom-Willemsen commented 6 years ago

I've reached the end of my timebox on this issue.

What I've done:

Main PR is here: https://github.com/ISISComputingGroup/EPICS-inst_servers/pull/156/files

genie_python dependencies are here: https://github.com/ISISComputingGroup/genie_python/pull/122

AdrianPotter commented 6 years ago

image

AdrianPotter commented 6 years ago

To explain the image above:

The code cleanup Tom has done looks good, but I can't mark this ticket as complete until the basic collision avoidance is more reliable.

kjwoodsISIS commented 6 years ago

I am concerned about this. From the above description, I am unsure as to how we are determining whether there has been a collision or not.

jonelmer commented 6 years ago

My first thought is to try testing at lower speed, or adjusting the oversize parameter.

AdrianPotter commented 6 years ago

The motors do stop before they reach their assigned target positions, just not soon enough. I suspect that at lower speeds they would stop in time, but I'm not happy giving the green light to a collision system that only works in a subset of our parameter space. I don't think the relative speed of the blocks was unreasonable (30s-60s to reach each other from their starting positions).

Having just discussed this with Kevin and Kathryn, the expectation is that the parameters that ZOOM uses may be sufficient for the CDS to stop the motors in time. However, we don't know the extent of the parameter space where the CDS can be trusted, which introduces a very high degree of uncertainty for instruments that want to use it.

It was suggested that the CDS could be accepted in its current form, with the limitation documented and communicated to scientists. I think that leaves a high risk of it being forgotten and leading to a collision somewhere down the line. In spite of any warnings we might deliver, the CDS is likely to be blamed. If we are to go down that route, I think we need to very explicitly warn people via the software, on each use, that the CDS has limited capacity.

I don't know what the oversize parameter is, I only approached this ticket from a testing point of view. Is it fixed? It seems to me that a sensible oversize parameter should be a function of speed and deceleration. If it is a fixed number then it will always be limited in its capacity to predict the necessary stopping distance. If it's variable then its current formulation isn't accurately predicting the necessary oversize.
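To illustrate the point about speed and deceleration, here is a rough sketch of a speed-dependent oversize, assuming uniform deceleration so that the braking distance is v²/(2a). The function names and the `latency` and `margin` parameters are hypothetical, and this is not how the CDS currently calculates its oversize.

```python
def stopping_distance(speed: float, deceleration: float) -> float:
    """Distance covered while braking uniformly from `speed` to rest."""
    if deceleration <= 0:
        raise ValueError("deceleration must be positive")
    return speed ** 2 / (2.0 * deceleration)


def dynamic_oversize(speed: float, deceleration: float,
                     latency: float = 0.0, margin: float = 0.0) -> float:
    """Hypothetical oversize that grows with speed.

    latency: assumed delay between detecting a predicted collision and
             the motor starting to decelerate (same time units as speed).
    margin:  fixed extra clearance added on top.
    """
    speed = abs(speed)
    reaction_distance = speed * latency
    return reaction_distance + stopping_distance(speed, deceleration) + margin
```

For example, with a speed of 10, a deceleration of 5, a latency of 0.5 and a margin of 1 (in consistent units), this gives 5 + 10 + 1 = 16. A fixed oversize smaller than that would not leave enough room to stop at that speed.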

I will arrange a meeting with Tom, John and me for next week so we can discuss this issue.

kjwoodsISIS commented 6 years ago

I think this meeting needs to establish the following things:

  1. Is the CDS being used correctly?
    1. if so, why did Adrian's tests flag up a potential problem?
    2. there is CDS documentation here
  2. Are the parameters currently used on ZOOM the appropriate parameters?
    1. if so, we should document them and state why (e.g. in the ZOOM wiki) they are appropriate.
    2. if not, we should determine the appropriate parameters, document them and again state why they are appropriate
  3. How are we going to prove it works on ZOOM itself?
    1. presumably, during the shutdown, we can install & test it on ZOOM
    2. what special precautions do we need to take to make sure we don't damage anything during testing?
    3. how much visibility do we have of the detectors & baffle while they are moving?

Once we are happy that it is OK to deploy the CDS, we also need to do the following:

  1. Write some simple documentation for the scientists
    1. The documentation should include a warning to the effect that
      1. the setup & parameters of the CDS have been carefully defined to prevent collisions
      2. if the setup & parameters are changed without consulting the motion-control team, you risk causing a collision and potentially damaging equipment
  2. Inform the scientists that the CDS has been deployed
  3. Demonstrate the CDS to the scientists
    1. During the demo, impress upon the scientists that changing the setup & parameters risks causing a collision & consequent damage.

The scientists have a duty to use the equipment safely and within operational parameters. We (computing controls & motion control) have a duty to make sure the scientists know what the operational boundaries are. If we need to put some warning text on the GUI (e.g. saying don't change parameters unless you know what you are doing), we should do so.

AdrianPotter commented 6 years ago

The meeting has already happened I'm afraid, but I can summarise the discussion and comment as far as possible on the points above.

To summarise:

To answer your questions specifically, @Kevin-Woods-Tessella:

  1. The current use of the system is that the prototype has been deployed on ZOOM to see if it can avoid scientists triggering their bump strips and avoid some manual resets.
    1. The CDS is itself a background task and so the effect on user behaviour is unknown until usage is determined from logs and user interviews
    2. Responsibility for the collection and analysis of the above data is with the motion controls team.
    3. My tests had three possible features that caused an issue:
      1. Simulated motor behaviour does not always match real motor behaviour, particularly around soft limits
      2. The speeds were higher in my tests than on ZOOM
      3. I was moving both motors simultaneously
    4. All three of the factors leave considerable unknowns about the operationally valid conditions for the CDS to stop motors before collisions. @jonelmer argued strongly that the CDS should be a monitoring and avoidance system, rather than a prevention system, and motion control will never expect it to prevent safety critical collisions.
    5. The above will undoubtedly affect prioritisation of the future development and rollout of the system. If it does not prove to deliver a reliable enhancement to the user experience then it is less likely to have resources allocated to it.
  2. The parameters being used on ZOOM were implemented using the measured distances as provided by @DominicOram.
    1. [ ] TODO: Record these in the wiki
  3. The intention for the CDS on ZOOM:
    1. It will record data during the course of the current cycle and provide information on the use of the system and its responses.
    2. The goal for ZOOM is for the CDS to reduce the frequency with which bump strips are hit and need to be manually reset.
    3. The scientists have been explicitly told that it is a trial system and not to be relied on to prevent collisions
    4. There is no risk of damage on ZOOM because the existing hardware bump strips will prevent it. The system should in theory stop the motors before the bump strips are triggered, but if it fails to do so it will not cause damage to the equipment.
    5. The motion control team will analyse the log data for the CDS before new requirements are produced.
    6. Notably in the cases of IMAT and LARMOR, the use case is very different because the type of motion which could cause collisions wouldn't be stopped with bump strips. This is very different from ZOOM and significant extra work would be needed before supporting that use case.
      1. This has been made clear to @jonelmer as a representative of the motion control team
      2. It is unknown whether any expectation exists on LARMOR or IMAT for this system to eventually reach them and, if so, on what timescale.
    7. [x] TODO: @KathrynBaker, does the above match with your understanding of how the system is being trialled on ZOOM? (Confirmed with a conversation in the office just now)
      1. @jonelmer has confirmed that further roll out of the system will only happen after the ZOOM trial is analysed and that the prioritisation and scoping of future requirements will depend on the system working as intended at each stage. We have made it clear that the system in its current form is limited and any resourcing from the IBEX team cannot be guaranteed. Critical instrument requirements to the motion controls team for safety and collision avoidance will not depend on IBEX and the CDS

As on the pull requests, I'm happy for the existing code to be merged. It causes no obvious changes in the behaviour of the system, but reduces the technical debt significantly. There is a ticket for addressing the remaining debt.

Contrary to the name of the ticket, the system has already been rolled out on ZOOM in its existing form as part of the maintenance day activities of this cycle. There is therefore little to no risk of further impact from merging the code.

In terms of actions:

  1. Scientists should not be interacting with the CDS. It should operate as a background helper to catch collisions.
  2. The current setup is hard-coded to ZOOM's configuration and cannot readily be accessed by the scientists.
  3. I'm told the scientists are already aware it is active, and its limitations.

As actions from the meeting:

  1. I will update the documentation based on our current knowledge from the meeting and discussion on this ticket
  2. @Tom-Willemsen will, as part of the rework, rename the system internally to something that has less of an implication that this system is about preventing collisions (e.g. "dynamic limit monitor").

I hope that clarifies the situation. As the remaining functional part of this ticket is purely the merging of @Tom-Willemsen's pull request, I am happy to merge it once the above changes have been made. If there is additional discussion needed about the requirements and future scope of this sub project then I recommend it take place outside of this ticket.

kjwoodsISIS commented 6 years ago

@AdrianPotter - thanks for the clarifications. These address my concerns. I agree with @jonelmer's argument that the CDS should be seen as strictly "a monitoring and avoidance system". I think the suggested name, "dynamic limit monitor", is a little too obscure. Perhaps we could call it something like "collision assessment monitor" (or similar).

kjwoodsISIS commented 6 years ago

Suggested enhancements for the CDS were originally proposed in #2040. These have now been moved to the CDS Wiki page.

AdrianPotter commented 6 years ago

For future reference: The schematic of ZOOM from the office white board

img_20180514_133818

AdrianPotter commented 6 years ago

I have cross-referenced with the wiki. If we could just decide and update the name then I will merge and close the ticket.

In terms of the name, Collision Assessment Monitor is fine with me (it also has the abbreviation CAM, which is better than CDS or DLM).

kjwoodsISIS commented 6 years ago

OK. Decision made - let's go with CAM.

KathrynBaker commented 6 years ago

This has been tested on the live system, and a new ticket (#3220) has been created to investigate and solve that problem. The system as it currently stands is certainly usable and appropriate for use on ZOOM for the time being, with one minor change required: the Oversize is currently set to 5, but a setting of 8 is more appropriate given the difference in location of the actual ends of the bump stop triggers, so the value in the configuration needs updating to reflect that.