glue-viz / glue

Linked Data Visualizations Across Multiple Files
http://glueviz.org
Other
739 stars 153 forks source link

Roadmap for unit support in glue/glue-jupyter #2295

Open astrofrog opened 2 years ago

astrofrog commented 2 years ago

Background

Component objects can have units set in the units attribute, which are currently not actually used anywhere.

Adding support for units covers several distinct parts of glue which I describe here:

Linking

Links currently ignore units, in particular LinkSame. A first step towards unit support would be to make sure that either LinkSame understands units, or we create a new class that does (we need to make sure we don't break old session files when implementing this). As a first step for experimenting with this, we could always create a link class that does this specifically with astropy.units and we could even have a link class for specifically dealing with the spectral axes (which requires the spectral() equivalency). This could be implemented in the core glue package because even though it would use astropy, it is pretty generic.

With this in place, it would then already be possible for instance to use the Scatter viewer to show e.g. two spectra in different units and have them be shown correctly.

Display units

Regardless of the underlying units of the data, it might be valuable to have viewers be able to offer the option to users to select what units they want to show the data in. This will require adding options for display units on the viewer class and then converting the units on-the-fly in the layer artists.

Underlying/default data units

It might also be desirable to be able to change the default units of a component so that for all intents and purposes it acts as if it was in a different unit. We can do this without really changing the underlying data buffer but we need to be careful because all viewers need to then properly implement support for the NumericalDataChangedMessage message (it would be a good way to debug it). This is perhaps the lowest priority of the four areas mentioned here.

Profile viewer

The profile viewer is a trickier viewer to add unit support to, because historically it has been able to show data together which aren't linked and are in different units (each layer can have its own attribute for the y axis which makes it hard to define what the 'true' y axis unit should be). I think that for this the easiest solution might be to initially just add a viewer-level attribute which is the y-axis unit and default that to None which allows any data to be shown on the y axis, but then have an option to set the unit, perhaps using a dropdown populated with equivalent units from all layers, and then disabling incompatible layers.

dhomeier commented 2 years ago

Perhaps related to •3, should Component look for Quantity-like inputs to data to set default units on __init__ (basically so Component(np.arange(6) * u.micron) would work, and by inference, Data(...))?

astrofrog commented 2 years ago

Yes that would be nice!

dhomeier commented 2 years ago

I have pushed a first working implementation based on #2294 (along with basic equivalencies support) to my local fork / astrofrog/glue#7, which opens a few possible paths for getting the attached units through add_component:

  1. let autotype pass asanyarray(data) instead of asarray(data) and have Component probe for unit properties on __init__.
  2. search for usable unit properties in autotype and pass those to the units kwarg.

The former has the advantage of covering all applicable Component subclasses, but would not work on a direct initialisation (of course the respective mechanics from 2 could still be duplicated there). But the first approach also raises the issue how to proceed with a quantity-like data input to Component (which can already happen on explicit initialisation). In my current implementation I have picked the more conservative approach of falling back to data.value and let the units still be treated explicitly by _units_scale etc. Much of that might be more conveniently done with data already supporting units, but I am worried about conflicts for cases of non-ndarray-based data.

mrclary commented 2 years ago

@astrofrog, have you already considered using the pint package? It already has integration with numpy, matplotlib, and pandas.

dhomeier commented 2 years ago

pint does not seem to support equivalencies, which would make it difficult to use interchangeably with something like the astropy.units framework.