bluesky / ophyd

hardware abstraction in Python with an emphasis on EPICS
https://blueskyproject.io/ophyd
BSD 3-Clause "New" or "Revised" License
51 stars 79 forks source link

Add Long name to OphydObj #1187

Open cjtitus opened 7 months ago

cjtitus commented 7 months ago

As discussed in #1161 this adds an attribute long_name to the base OphydObj, and also slightly modifies the Signal classes to pass long_name through.

If long_name is not set, it will default to returning name, so that you can just use the attribute as a display label without worrying about whether or not it was set.

long_name can be passed into components. It is not concatenated with its parent device's long_name, which I think would be undesirable behavior. It is something that should be very intentionally set to provide a self-contained descriptive label for whatever it is passed to.

cjtitus commented 7 months ago

Possibly one question to resolve in this PR is whether the long_name should be saved anywhere. Currently, it's not going to be read into Tiled at all when you read/describe an object.

I think this is the desired behavior, we would just need to emphasize in the docs that long_name is for display only.

prjemian commented 7 months ago

My crystal ball tells me there will be a request to save long_name (if it is defined) so data clients can use it. The intention in NeXus was to supply something more descriptive than the short name for the purposes of a plot or axis title.

cjtitus commented 7 months ago

So the next question becomes where and how long_name could be saved. It's not a signal, it's an attribute, so it's not going to get picked up by describe/read, or configure.

Name is only saved implicitly, by being the key that is used to identify everything in the documents.

I was going to go through the list of pros and cons of making a long_name AttributeSignal that gets read, but this only works for devices, not a pure signal. Whereas part of my motivation is

ring_current = EpicsSignalRO(<prefix>, name="ring_current", long_name= "NSLS-II Ring Current (mA)")

And it's clear here that the ring_current signal couldn't itself have a long_name signal, because ring_current is an EpicsSignalRO, and I'm not going to change everything into a device.

So in this framework, does anybody have any ideas for how we'd get the long_name attribute saved?

I suppose, if we wanted to be maximalist about things, I could modify describe so that it picks up long_name if one is available, and then you have

In [11]: ring_current.describe()
Out[11]: 
{'ring_current': {'source': 'PV:SIM_SST:current',
  'long_name': 'NSLS-II Ring Current (mA)',
  'dtype': 'number',
  'shape': [],
  'units': '',
  'lower_ctrl_limit': 0.0,
  'upper_ctrl_limit': 0.0,
  'precision': 2}}

This to me seems like a pretty major change, even though it's probably backwards compatible, and probably won't break anything (unless someone's iterating through describe in such a way that an unexpected key causes problems). It would be a large change in philosophy, in the sense that we are no longer pretending that all information is coming from EPICS records.

prjemian commented 7 months ago

pros and cons of making a long_name AttributeSignal that gets read, but this only works for devices, not a pure signal. Whereas part of my motivation is

Example of AttributeSignal that derives from not a Device:

    class_name = Cpt(
        # fmt: off
        AttributeSignal,
        attr="__class__.__name__",
        doc="Diffractometer class name",
        write_access=False,
        # fmt: on
    )
prjemian commented 7 months ago

Sorry, I see your point now. The AttributeSignal class cannot be used outside of a Device. The attr= keyword relies on that context

cjtitus commented 7 months ago

In that example, class_name is an AttributeSignal that is part of the diffractometer device.

An EpicsSignalRO could not itself have a nested AttributeSignal that would carry the long_name of the EpicsSignalRO.

cjtitus commented 7 months ago

Any thoughts on adding the new attribute to describe?

coretl commented 7 months ago

I think it's a good idea, conversation with @danielballan reminded me that DataKey can contain arbitrary keys too, so shouldn't break anything. I guess long_name is a good key too as that is the attribute, and I guess we should suppress it if it hasn't been set?

cjtitus commented 7 months ago

@coretl You lost me with the reference to DataKey. Are you suggesting an alternate place for the long_name attribute to be stored? Or are you agreeing that long_name can go under describe?

coretl commented 7 months ago

@coretl You lost me with the reference to DataKey. Are you suggesting an alternate place for the long_name attribute to be stored? Or are you agreeing that long_name can go under describe?

I'm agreeing that it can go under describe. The reference to DataKey comes from what bluesky expects describe to produce: https://github.com/bluesky/bluesky/blob/5df94da730e51e6432420bab92a7f92ec06c8a49/src/bluesky/protocols.py#L283-L303

which is defined here: https://github.com/bluesky/event-model/blob/fc6e30eae896073eda4013f7c4af3362fead61c1/event_model/documents/event_descriptor.py#L10-L60

So we should eventually add it as an optional entry there, so that downstream consumers know "if there is a field called long_name, this is what you can use it for"

cjtitus commented 7 months ago

@coretl I understand now, thanks. Wasn't familiar with the EventModel terminology.

I'm going to go ahead and add long_name to the describe method for all the built-in signals. The way I've implemented things, long_name falls back on name if it is not defined. I think that for the purposes of people trying to write nexus exporters, it will probably be more helpful if long_name is always present (but sometimes redundant) rather than sometimes present.

If anyone has a different opinion, I could add a check for the underlying _long_name attribute to see if it really exists.

cjtitus commented 7 months ago

There is a bit of a wrinkle now, in that the really aggressively signal-based nature of Ophyd comes back to bite us. If you gave you device as a whole a long_name, there's no way to save that. You just get info about the signals, and a device doesn't really even have a notion of information about itself that's not just stored in a signal...

In [5]: tes.describe()
Out[5]: 
OrderedDict([('tes_mca_counts',
              {'source': 'PV:SIM_SST:tesmca:COUNTS',
               'dtype': 'integer',
               'shape': [],
               'units': '',
               'lower_ctrl_limit': 0,
               'upper_ctrl_limit': 0}),
             ('tes_mca_spectrum',
              {'source': 'PV:SIM_SST:tesmca:SPECTRUM',
               'dtype': 'array',
               'shape': [800],
               'units': '',
               'lower_ctrl_limit': 0,
               'upper_ctrl_limit': 0})])

In [6]: tes.long_name
Out[6]: 'Microcalorimeter Detector'

In [7]: tes
Out[7]: EpicsMCABase(prefix='SIM_SST:tesmca:', name='tes_mca', read_attrs=['counts', 'spectrum'], configuration_attrs=['exposure_time', 'llim', 'ulim', 'nbins', 'energies'])

This kind of cuts to the core of the Ophyd model (and, I would say, exposes some of the inadequacy in the model) where a device is just a collection of signals that get bundled together. Not really sure where to go from here while staying within the current paradigm.

coretl commented 7 months ago

@tacaswell any ideas?

cjtitus commented 7 months ago

The major problem is actually this: If you're going to pass a long_name to anything, it's going to be a device, which is what you actually instantiate. The Components of the device are all going to be defined in the class, so you don't get an opportunity to pass in a long_name from outside.

Ophyd doesn't really have the concept of a "detector" structure that would have a primary "data" signal that the long_name could flow through to, even though this is how all of my detectors are actually structured.

Now, I suppose I could modify all of my own detector classes to just pass the device long_name down to a particular signal. But this doesn't help anybody else who expects the long_name for their own device to be saved, somehow.

tacaswell commented 7 months ago

I think long_name has to be propagated down like name.
https://github.com/bluesky/ophyd/blob/5c03c3fff974dc6390836fc83dae4c247a35e662/ophyd/device.py#L252-L274 is the code where we actually instantiate the child devices and what ever logic is needed to propogate the long_name should be added there. It is not immediately clear to be what the "right" way to do this propagation is though.

Another option is to let long_name be run-time settable and as part of the init process set it from the outside.

https://github.com/bluesky/event-model/blob/fc6e30eae896073eda4013f7c4af3362fead61c1/event_model/documents/event_descriptor.py#L139-L147 The object_keys entry at the top level lets you re-group the data keys based on what device it comes from. https://github.com/bluesky/bluesky/blob/aaa8822d0747d67bc84333ba95db99187ee62b8e/src/bluesky/bundlers.py#L189 we could change that to try long_name if it exists and name if it does not but that would be an API change and one of the selling points of this is that the value of the long name shows up as a value (not as a key) so we do not have to put any rules / validation on its form.

Ophyd doesn't really have the concept of a "detector" structure that would have a primary "data" signal that the long_name could flow through to, even though this is how all of my detectors are actually structured.

Look at the extra work that is done in EpicsMotor which is an example of a "detector" which has a primary "data" (the RBV) signal that currently the name of the parent flows down to with out any post-fixes being tacked on.

cjtitus commented 7 months ago

I didn't want long_name to propagate like name, because then you end up with these awful strings like "My Super Special Detector_counts". The whole motivation for this PR in the first place is to separate human-readable, "space-separated" strings from the compact names that make nice dictionary keys, but with a way to map from convenient "device keys" to the readable "long_name" for automatic plot labeling, export to nexus, etc.

cjtitus commented 7 months ago

One possible thing to do would be to propagate long_name by concatenating with a space, rather than an underscore.

So then you could define long_name for a device as "My Device", and long_name for the signal as "Counts", and they could be combined to yield "My Device Counts", which is at least readable, albeit this may be a bit more verbose than you want sometimes. I suppose the main signal could always have its long_name = "" and then you get one signal that just has its parents' long_name. How does this sound?

tacaswell commented 7 months ago

Concatenating with ' ', allowing a component to set long_name='' to mean "use my parent's long name" and long_name=None to mean "do not give me a long name" seems light the best path to me.

It might also be worth coming up with someway to spell "the long_name was set correctly in the kwargs passed to Component, do not mess with it" but that can be added later.

coretl commented 7 months ago

@tacaswell I still don't quite get this, the verbs we support are describe and describe_configuration, each of which give you a Dict[str, DataKey], which is the ideal place to put long_name for each child Signal, but there is nowhere here to return long_name for the Device itself. There is also not an easy place to put this in the Descriptor, unless we manufacture a <detectorname>-long_name DataKey to stuff into the configuration section of the Descriptor...

tacaswell commented 7 months ago

Right, we can easily add long_name for signals to the deepest dictionary in data_keys, getting the long name of the top-most device is harder.

Right now the obj.name of the outer object shows up as a key in two places: 'hints' and 'object_keys'. If we want to stash the long name of the outer object then I think we have to add a object_longnames at the top level of the descriptor (e.g. peer with 'object_keys' and 'hints').

cjtitus commented 3 months ago

Finally coming back to this, thinking about the propagation of long_name. I think if a Component's long_name is None (or really, if it doesn't exist), the default behavior will be to join the Component's name with the parent's long_name (with space separation). This maximizes the utility of having a quasi-auto-generated display name that will be more read-able than the underscored name.

cjtitus commented 3 months ago

So, I have no idea why tests for python 3.9 immediately segfaulted, nor why python 3.10 was unable to even connect to github to check out the code, but I doubt it's anything I did, and I suspect that the tests would pass if re-run.

Anybody up for a review? @coretl @tacaswell

tacaswell commented 3 months ago

This will end up in the data_keys dictionaries so it should just flow through and you can use it if you look for it.

coretl commented 3 months ago

This will end up in the data_keys dictionaries so it should just flow through and you can use it if you look for it.

For the Signals I can see that this ends up in the descriptor. But what about the Device? It has a long name, but no describe() method, so how does its long name make it into the descriptor document?

In the call yesterday we discussed adding extra metadata (like long_name) to the hints of the Device so it ended up in the Descriptor. Does this mean the long_name of the Signal should also go in hints? Or should it go in the output of describe() as this PR has it?

cjtitus commented 3 months ago

This PR does not resolve the issue of describe for a Device being, in some sense, less complete than describe for a Signal. I think that putting long_name for Signal in describe() is the right choice, and it is up to a future PR to resolve the question of what to do with Device information that doesn't currently have a sensible place to live.

That could be @tacaswell 's suggestion of having a new top-level key in the Descriptors, or it could be some other place to stash the information, a special attribute signal that gets auto-created for Devices that holds long_name, etc. However, I think that none of those possible solutions impact this PR, so I think we should merge this PR, and then let people play with different ways to fix up Devices later.