glue-viz / glue

Linked Data Visualizations Across Multiple Files
http://glueviz.org
Other
721 stars 152 forks source link

Allow serializing and deserializing named astropy units #2475

Closed Carifio24 closed 2 months ago

Carifio24 commented 5 months ago

This PR adds serialization support for astropy NamedUnits. The immediate use case for this is storing the decay unit in https://github.com/glue-viz/glue-wwt/pull/103. This seems to work well for me, and it looks like astropy will give the builtin instance after deserialization, e.g.

import astropy.units as u
assert u.Unit("m") is u.m  # Passes

However, if someone more astropy-savvy than myself knows of a better way that we should do this, please let me know!

Carifio24 commented 5 months ago

@dhomeier Thanks for the review! My main motivation for ensuring exact identity was making sure that unit conversions would work correctly. I wasn't really sure what sorts of unit classes were out there so I didn't want to overreach and cause something to break in the future on some unit class that I hadn't prepared for.

I wasn't aware of CompositeUnit but after some experimentation it seems like that's not a concern there, regular equality is fine. I've updated things to match your suggestions.

dhomeier commented 5 months ago

Admittedly I have not checked all examples of exotic units one might encounter here. There could be problems with

  1. Units that are only available after enabling a specific system, like u.mile in u.imperial.
  2. vounit format on top of that even allows defining arbitrary custom units that could shadow standard units.

Currently those cases would probably simply fail on creating u.Unit("furlong") in the loader, but perhaps it makes more sense to add a sanity check already in the saver:

# Check that unit can be parsed back with default enabled systems
try:
    with u.set_enabled_units([u.si, u.cgs, u.astrophys]):
        saved_unit = u.Unit(unit.to_string())
except ValueError:
    raise GlueSerializeError(f"Serializing units of '{unit}' is not yet supported")

(a bit complicated since some custom system might already have been added in the saver context, but not be available when reading back).

Carifio24 commented 5 months ago

@dhomeier I think the sanity check is a good idea. I'm trying to imagine what it would look like for a plugin/config to overwrite this kind of behavior in a straightforward way.

I'm wondering if it would be better to give a warning rather than raise an error. It seems like non-default units (e.g. the units in imperial) aren't generally distinguishable by their type (which is what the glue serializer/deserializer use for dispatch), so I'm trying to think what would be a good way to easily overwrite/extend the serialization behavior for these units without having to do something like the following, which feels clunky:

original_loader = GlueUnSerializer.dispatch[UnitBase][0]
@loader(UnitBase)
def _updated_unit_loader(rec, context):
    if is_one_of_my_non_default_units(rec["unit_base"]):
        # Return my custom loaded unit
    return original_loader(rec, context)

and similarly in the saver. Regardless, I think it's going to be incumbent on anything that implements a custom saver to implement a custom loader, and vice versa.

Or maybe I'm overthinking things and we should just raise an error as you suggested.

Carifio24 commented 4 months ago

@dhomeier I've updated this to make the changes that you suggested.