SNEWS2 / snewpy

A Python package for working with supernova neutrinos
https://snewpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License
24 stars 17 forks source link

Update RegistryModel: making it a decorator instead of base class #287

Closed Sheshuk closed 7 months ago

Sheshuk commented 8 months ago

Closes #246

I suggest an improvement of the way we define the RegistryModel for each class.

What we have now

Currently for each model in snewpy.models.ccsn we define a class, inheriting from _RegistryModel. In the class variables we define params and _param_validator (and optionally _param_abbrv for the model input parameters. Then in the class __new__ method we have to:

  1. Define the documentation string, listing all the input parameters
  2. Run the validation of the input parameters
  3. Generate the filename based on the input parameters
  4. Populate the metadata dictionary from the input parameters
  5. Initialize and return the corresponding Loader class
  6. After all, format the documentation string based on possible values of the input parameters

Example: Sukhbold_2015

https://github.com/SNEWS2/snewpy/blob/d570f07e8b92f0993546cf28095beae29bd46d94/python/snewpy/models/ccsn.py#L182-L233

What is the problem

  1. These steps have to be performed (copied over and over) for each new model. It makes it not very straightforward to define the new model here, harder to read the code because of many copy-pasted parts) and more error-prone.

  2. A user calls a constructor of the object ccsn.Sukhbold_2015, but in the end gets an instance of loaders.Sukhbold_2015. This can be misleading.

What I suggest

In fact, only steps 3 and 5 are different for each model. The rest are completely defined by the input parameters, and can be automated.

I suggest making RegistryModel a decorator of the class, taking params and param_validator as arguments. The decorated class should inherit from the corresponding loader. In the __init__ function of the class user should define only the way to calculate the filename from the class and run the loader's init function.

So the full example for the Sukhbold_2015 would look like this:

@RegistryModel(
    progenitor_mass = [27., 9.6] * u.Msun,
    eos = ['LS220', 'SFHo']
)
class Sukhbold_2015(loaders.Sukhbold_2015):
    """Model based on simulations from Sukhbold et al., ApJ 821:38,2016. Models were shared privately by email.
    """
    def _metadata_from_filename(self, filename:str):
        metadata = {
            'Progenitor mass': float(filename.split('-')[-1].strip('z%.fits')) * u.Msun,
            'EOS': filename.split('-')[-2]
        }
        return metadata

    def __init__(self, progenitor_mass:u.Quantity, eos:str):
        if progenitor_mass.value == 9.6:
            filename = f'sukhbold-{eos}-z{progenitor_mass.value:3.1f}.fits'
        else:
            filename = f'sukhbold-{eos}-s{progenitor_mass.value:3.1f}.fits'
        return super().__init__(filename, self.metadata)

Outlook

Making ModelRegistry a decorator, aware of both model class and its allowed parameters, allows us a lot of flexibility.

  1. For example it would make it easy to fill a table of models with all their parameters - so a user can browse models with specific progenitor mass, or EOS (see #289)
  2. It will make implementing #243 easier
  3. Makes more straightforward for others to contribute their models. If the data are in recognized formats (i.e. there are no need for new loaders classes), adding a registry model is very straightforward with a minimum of repeating code.
JostMigenda commented 7 months ago

Thanks for the updates and responses!

Two more points on the Fornax_2022 model:

Sheshuk commented 7 months ago
  • I’d add _init_from_filename = False, since that model never had the deprecated behaviour.

Right! I forgot about this, thanks. Fixed.

Sheshuk commented 7 months ago

On a general note, I noticed that we may have been a bit inconsistent about how to treat e.g. eos in the argument list if there’s only one possible value for a model.

This is interesting question - what should we do with these parameters. I agree that having an argument with a single allowed value is not very useful. But on the other hand it's expressive - user can see the EOS assumed for a given model right away from the constructor's default value.

I was thinking, maybe we can use these "fixed" parameters as some additional way to store the metadata. If we implement #289 these input default parameters can be also used for querying - so a user can list all the models with the given EOS.

Of course, we can fill these in the metadata, but right now this happens at initialization (and sometimes even in the loaders classes), which means we can't make a table of possible values of metadata, without initializing all the possible models.

JostMigenda commented 7 months ago

I’ve commented at #289 with a few more ideas. It’d be good to reuse work from this PR for that; however, I think that’s starting to go far beyond the scope of this PR, so I’d suggest we table that for now.

While trying some example notebooks, I’ve noticed that the Model.param property is currently missing in this PR:

from snewpy.models.ccsn import Nakazato_2013

# SNEWPY v1.4
Nakazato_2013.param
> {'progenitor_mass': <Quantity [13., 20., 30., 50.] solMass>,
> 'revival_time': <Quantity [  0., 100., 200., 300.] ms>,
> 'metallicity': [0.02, 0.004],
> 'eos': ['LS220', 'shen', 'togashi']}

# This PR:
Nakazato_2013.param
> AttributeError: type object 'Nakazato_2013' has no attribute 'param'

A minimal fix would be to add the line param = pset to the definition of class c(base_class), resulting in:

# with minimal fix:
Nakazato_2013.param
> ParameterSet:
> *     progenitor_mass=Parameter(name="progenitor_mass", label="Progenitor mass", description="Mass of model progenitor in units Msun", values='[13. 20. 30. 50.] solMass')
> *     revival_time=Parameter(name="revival_time", label="Revival time", description="Time of shock revival in model in units ms", values='[  0. 100. 200. 300.] ms')
> *     metallicity=Parameter(name="metallicity", label="Metallicity", description="Progenitor metallicity", values='[0.02, 0.004]')
> *     eos=Parameter(name="eos", label="EOS", description="Equation of state", values='['LS220', 'shen', 'togashi']')

This works in principle, but some tweaks to the string representation of ParameterSet and/or Parameter classes would be useful.

Sheshuk commented 7 months ago

While trying some example notebooks, I’ve noticed that the Model.param property is currently missing in this PR

As I understand, this is also for a backward compatibility. I don't like to keep the attribute name param and I think we should remove it in the future.

For now I implemented a slightly extended fix (making param a class property returning dict as before, instead of changing the string representation of Parameter/ParameterSet). I added parameters class attribute to access ParameterSet

Sheshuk commented 7 months ago

Mori_2023 showed a weak spot of automatic parameter checking: if a parameter contains float values (like axion_coupling), the check if the given value is in the allowed values is problematic (0.199999998!=0.2). It needs rounding, as was done by Segev in #285, but here the checks are automated, so we need to implement precision in each Parameter.

So

  1. I added precision attribute to Parameter class, if it's provided, apply the rounding of input value before doing any checks
  2. When declaring RegistryModels parameters, we have a lot of repeating the param name, i.e. progenitor_mass = Parameter(name="progenitor_mass",...). So now the name argument is not needed.

Now I think this PR is ready for review again.

JostMigenda commented 7 months ago

Sorry for my delay in following up here. This all looks good, thanks!

As a final step, before I approve and merge this, I’ve gone through the model classes in snewpy.models.ccsn to double check that we didn’t accidentally introduce any changes to the argument lists, as discussed above in the eos example. Unfortunately, it seems there are a few more subtle changes:


It looks like the RegistryModel decorator currently turns ~all~ [edit: see note at the bottom] arguments into keyword-only arguments. For example:

@RegistryModel(
    progenitor_mass=[9, 10, 12, 13, 14, 15, 16, 19, 25, 60] * u.Msun,
)
class Fornax_2019(loaders.Fornax_2019):
    def __init__(self, cache_flux=False, *, progenitor_mass):
        # Note that `cache_flux` could be a positional argument here, as in previous versions of SNEWPY

# But the resulting signature, viewed e.g. by
from snewpy.models.ccsn import Fornax_2019
help(Fornax_2019)
# looks as follows:
>  __init__(self, filename: str = None, *, cache_flux=False, progenitor_mass)

(Another example: Previously, eos could be used as a positional argument e.g. in Tamborra_2014, so that Tamborra_2014("/path/to/snewpy/models/Tamborra_2014/s20.0c_3D_dir1", "LS220") worked (but was deprecated, of course). The decorator now enforces eos to be keyword-only, so this raises TypeError: too many positional arguments.)


Another, more minor one. Code snippet below is from Tamborra_2014, but same applies to the Walk_2018/9 classes.

@RegistryModel(
    progenitor_mass = [11.2, 20., 27.] * u.Msun,
    direction = [1,2,3],
    eos=['LS220']  # Note that 'eos' is present here …
)
class Tamborra_2014(loaders.Tamborra_2014):
    def __init__(self, *, progenitor_mass:u.Quantity, direction:int):  # … but missing here!
        # …

I’m not sure whether it’s intentional that eos is included in the decorator, but not in the __init__? In practice, using from snewpy.models.ccsn import Tamborra_2014; help(Tamborra_2014) we see that the decorator updates the signature for __init__ by adding filename and eos, but it doesn’t add eos to the docstring. This can be fixed by explicitly adding eos:str='LS220' to the definition of __init__.


One more issue with handling of positional arguments. The following example worked in SNEWPY v1.4 and now fails:

OConnor_2013("/path/to/snewpy/models/OConnor_2013/", 12)  # using positional arguments for `base` and `mass`
> TypeError: RegistryModel.<locals>._wrap.<locals>.c.__init__() takes 1 positional argument but 3 were given

This doesn’t appear to be quite the same issue as when using eos as positional argument in Tamborra_2014 (above). OConnor_2013 is a special case anyway, since it currently has a tar file instead of individual files per progenitor; I wonder if that complicates matters here?

_[Edited to add: Turning all arguments keyword-only appears to only happen for classes that use the deprecated filename argument. For example, OConnor_2013 has set _init_from_filename=False and the base and mass arguments aren’t affected.]_

JostMigenda commented 7 months ago

Hm, one more thing: In Kuroda_2020, the mass and eos arguments are marked as @deprecated, but I’m not getting a warning when I use them?

>>> from snewpy.models.ccsn import Kuroda_2020
>>> import astropy.units as u
>>> Kuroda_2020(mass=20, rotational_velocity=0*u.rad/u.s, magnetic_field_exponent=0)
Kuroda_2020 Model: LnuR00B00.dat
EOS              : LS220
Progenitor mass  : 20.0 solMass
Rotational velocity : 0.0 rad / s
B_0 Exponent     : 0
Sheshuk commented 7 months ago

Hi Jost, thanks for this thorough check!

I didn't cross-check the positional/keyword arguments. I assumed that since in most cases we had (filename, *, other_params), so I moved everything to keyword. But it's easy to keep the initial signature from the class constructor, just adding the filename as first positional argument.

I’m not sure whether it’s intentional that eos is included in the decorator, but not in the __init__?

Yes, that is intentional: the constructor in the decorated class should be minimal (keeping only the arguments needed to produce the filename). And all the other "fixed" arguments will be added, if they are present in the decorator parameters. I did this in view of our discussion in #289, so if one wants to add a fixed parameter to a model, he just needs to define it once in the RegistryModel decorator - like in these examples you mention.

But missing docstring is a bug, I'll work on that.

Sheshuk commented 7 months ago

Hm, one more thing: In Kuroda_2020, the mass and eos arguments are marked as @deprecated, but I’m not getting a warning when I use them?

This is because python by default ignores DeprecationWarnings if they're not from __main__. I switched to issuing a UserWarning, so it is shown now.

Sheshuk commented 7 months ago

Hopefully now it's fixed. I tried to separate the legacy (backward compatibility) parts, so it will be easier to remove when we get to v2.0.

Sheshuk commented 7 months ago

@JostMigenda thanks a lot for your help and checking of all these cases! I missed these problems, and they really could break some of the users code. I hope this works now.