Add save/load - Githubissues

olliesilvester commented 1 year ago

As a follow up to https://github.com/bluesky/ophyd-epics-devices/issues/28, a method should be added to save all PV's of a device to a yaml file, and a load method should be added to set the device to a state specified in the yaml file.

callumforrester commented 8 months ago

Co-opting this into a design issue. Writeup of a discussion with @coretl, @tpoliaw and @abbiemery:

Malcolm's current setup is a large bank of settings saved to a yaml file in order and restored in reverse order. Then on top of that, particular per-device logic may be invoked that sets specific signals in a specific order. In other words a broad-stoke followed by a fine-grained approach where needed. This is useful for approximately capturing the desired state without too much code overhead, but it must always be best-effort as the starting state is unknown. It also presents difficulties if the user wishes to leave a particular setting untouched, perhaps because they are tweaking it manually and experimenting. The question is how close ophyd-async's load/save is to malcolm's

There was some discussion about an inverse system: rather than capture all possible state only add explicit logic to set the state we know we care about i.e. state that makes a difference to the scan we want to run. Over time as edge-cases are discovered that add more constraints on the state, we add these to the plan/prepare method/yaml file/whatever is responsible for defining the state. The principal disadvantage of this is it is extra code overhead that must be manually maintained, but hopefully that maintenance is spread out over time.

There was also discussion about decoupling load/save from flyers, so we provide a default implementation that they can use but don't have to, leaving open the possibility of others writing their own version that looks different or not using load/save at all.

coretl commented 8 months ago

We talked a lot about requirements that we can't currently check, like "Eiger trigger connected to PandA1 TTLOUT1", that would enable checks like "Sequencer has some connection to TTLOUT1". This is a good idea, but can be put in as a future enhancement.

The decision we need to make now is how to get from "I've poked my EPICS system into a working state and it runs a scan" to "my scan runs when the beamline has been doing something else". There are some options for this:

Change settings in Python, either in the plan or in the Device
Have an intelligent save, that structures the YAML file into phases, and excludes things based on the plan that will be running
Have an intelligent load, that takes a blob of settings as requirements, and knows how to apply them to the device

We were going for 2. because of ease of saving and re-saving the same things, but I'm wondering now if moving towards 3. would be better. This would mean that we can do a generic "save_device" plan with no saving into phases, but have an intelligent "load_device" that would only put the signals that had changed. We could still do user ignoring of specific parameters in the save file, but the saved file looks a lot more like the requirements file with the trigger wiring in it, which moves us in the right direction

olliesilvester commented 8 months ago

If you went for intelligent load, it would be useful for the generic save store a list of settings which still need changing after the load, things like the panda sequencer table's values.

Also, if it's helpful for discussion, Hyperion is currently using option 2 in https://github.com/DiamondLightSource/hyperion/blob/697cfffd78481312cccf4baad0c0358b560a54da/src/hyperion/device_setup_plans/setup_panda.py#L106

callumforrester commented 8 months ago

I think we should do because 1. is the most flexible option, but I also think it does not exclude 2. or 3. If we want, 3. we can just we the Python function that, in practice, is used to change the settings in most cases, but the door is still left open for special cases.

@coretl 3. looks interesting but I'm not sure how it would handle setting up a device differently for different types of scan?

coretl commented 4 months ago

Discussions with @callumforrester @olliesilvester @dperl-dls lead me to think a hybrid of 1 and 3 makes the most amount of sense:

Provide a generic save_device_yaml(device, yaml_path) plan that write all Signals (even read-only ones) to a YAML file
This is used as a baseline for a working Device, regardless of settings
Provide a series of setup_x_to_do_y(device, yaml_path, ...) intelligent load plans that will take the current device Signals, the baseline values, and a set of programmatically defined values, and apply them to the device in the right order, avoiding unnecessary puts

For example:

@dataclass
class ChangeSet:
    current: Dict[SignalW[T], T]
    required: Dict[SignalW[T], T]

    def require(self, signal: SignalW[T], value: T):
        self.required[signal] = value

    def changes(self) -> Dict[SignalW[T], T]:
        return {signal: value for signal, value in self.required.items() if self.current[signal] != value}

def changeset_to_restore(baseline: Dict[SignalW[T], T]) -> ChangeSet:
    current_values = yield ("locate", *baseline) 
    return ChangeSet(
        current = {signal: location["setpoint"] for signal, location in zip(baseline, current_values)
        required = baseline
    )

def write_changes_to_device(changes: Dict[SignalW[T], T]):
    group = uid()
    for signal, value in changes.items():
        yield from abs_set(signal, value, group=group)
    yield from wait(group)

def write_changes_to_panda(changes: Dict[SignalW[T], T]):
    # Remove the units from the changes dict as they need to be done first
    units = {signal: changes.pop(signal) for signal in changes if signal.name.endswith("units")}
    yield from write_changes_to_device(units)
    # Do everything else
    yield from write_changes_to_device(changes)

def setup_panda_for_repeated_triggers(baseline: Dict[SignalW[T], T], seq: SeqBlock, repeated_trigger_info):
    changeset = yield from changeset_to_restore(baseline)
    changeset.require(seq.table, repeated_trigger_info.generate_table())
    changeset.require(seq.repeats, 1)
    changeset.require(seq.enable, 0)
    yield from write_changeset_to_panda(changeset.changes())

Thoughts?

callumforrester commented 4 months ago

...maybe...

The issue, I think, is that this is a best-effort solution, meaning you have to make sure it doesn't cause more problems than it solves. The way I see it there are a few possible cases:

The user is unaware that restoring saved state would overwrite a PV they have deliberately tweaked
The user is unaware that tweaking some not-directly-related PVs will cause problems with their measurement
The user is unaware that a PV they expect to be restored is actually ignored
The user is unaware that the beamline is in the wrong state when they save
A special procedure is required to get to the desired state (i.e. more logic than just load())
...I've probably missed some

What I want is a pie chart showing how common each case is, so we know where best to direct the best-effort.

coretl commented 4 months ago

...maybe...

The issue, I think, is that this is a best-effort solution, meaning you have to make sure it doesn't cause more problems than it solves. The way I see it there are a few possible cases:
1. The user is unaware that restoring saved state would overwrite a PV they have deliberately tweaked

2. The user is unaware that tweaking some not-directly-related PVs will cause problems with their measurement

3. The user is unaware that a PV they expect to be restored is actually ignored

4. The user is unaware that the beamline is in the wrong state when they save

5. A special procedure is required to get to the desired state (i.e. more logic than just `load()`)

6. ...I've probably missed some
What I want is a pie chart showing how common each case is, so we know where best to direct the best-effort.

We cannot make a pie chart yet because we do not have an exhaustive save mechanism at present. We have GDA, which gives us situation 2 quite a lot when someone prods an areaDetector PV that GDA is not setting and breaks a scan. We have Malcolm which gives us 1 when we switch to using a saved design for a scan for the first time, 3 because the set of PVs to be restored is not exhaustive. I suspect that we also get 4 with Malcolm, because it is easy to do with a PandA, and the saved design for PandA is used in many scan designs. We have never encountered 5 yet, because the logic has been wrapped up in the scan phase, not the load phase.

I suggest we widen the discussion to @tomtrafford and @EdWarrick to see if they have any experience of which is more common and where we should be putting our effort

olliesilvester commented 4 months ago

Thoughts?

This seems good to me, at least for a PandA. Is it up to a user to get their baseline for a device? To me, there are two separate baselines.

Reset device to factory default settings
Reset device to a plan-specific state. Eg, configure and wire up a PandA's blocks, but don't touch things that need to be changed on every collection, like the sequencer table's values.

coretl commented 4 months ago

Thoughts?

This seems good to me, at least for a PandA. Is it up to a user to get their baseline for a device? To me, there are two separate baselines.
1. Reset device to factory default settings

2. Reset device to a plan-specific state. Eg, configure and wire up a PandA's blocks, but don't touch things that need to be changed on every collection, like the sequencer table's values.

Yes, I imagine that you poke a PandA til the plan works, then save a baseline, then incorporate it into a setup_* plan stub like above. I also imagine we would only store baselines for plan specific state.

coretl commented 4 months ago

There's one other option we didn't think of, we could always do the generic save part for the scan, emit it as part of the document stream, but never actually do the load from yaml file part. Then we would only programatically change settings by writing it in Python, but never actually load a yaml file, however we could always look back on a working scan and diff the two settings to see if we can work out what changed.

I know @callumforrester initially favoured this route, but I don't know how well it works for PandA

olliesilvester commented 4 months ago

There are so many settings you can change and forget about on the PandA that I think doing a load at some point is very useful, especially if you've been playing around with the config on the web gui

callumforrester commented 4 months ago

Looking at these again

The user is unaware that restoring saved state would overwrite a PV they have deliberately tweaked

The user is unaware that tweaking some not-directly-related PVs will cause problems with their measurement

@olliesilvester I agree with you if 2. is more common than 1. because then the precise state of all the possibly-related minutiae will be restored and the user doesn't have to worry. If 1 is more common then restoring a blanket state creates more problems than it solves.

bluesky / ophyd-async

Add save/load #16