Make a table of models with allowed parameters, which user can query

SNEWS2 / snewpy

A Python package for working with supernova neutrinos

https://snewpy.readthedocs.io

BSD 3-Clause "New" or "Revised" License

26 stars 19 forks source link

Make a table of models with allowed parameters, which user can query #289

Closed Sheshuk closed 7 months ago

Sheshuk commented 11 months ago

It would be nice for a user to have a possibility to query our ModelRegistry for models with specific parameters, e.g. get a list of models which use EOS='LS220' and have progenitor_mass=15 Msun.

I think it's rather simple to implement, since we are listing the allowed parameter combinations for each model. For example, it can be implemented as pandas.DataFrame, which lists all possible parameters for all registry models.

JostMigenda commented 11 months ago

Agreed; this would be a nice feature!

JostMigenda commented 10 months ago

The more I think about this, the more I like this idea; and I’m starting to think there is a much broader set of interesting queries than I initially realized. Just a few more examples:

Get a list of models that form a BH
Get a list of 3D simulations
Get a list of all models that include separate mu/tau neutrino fluxes (currently empty; but it would be nice to have at least one such model to demo that ability, once we add it)
Get models where the exact same progenitor was simulated by different codes (FORNAX/GR1D/PROMETHEUS-VERTEX/…)

(Okay, the last one is perhaps more of a stretch goal …)

Right now, none of these queries would be possible to do with the model initialisation parameters; and adding all this additional metadata in the __init__ signature (even though just 1–2 arguments typically need to be provided by the user) would be overwhelming. This suggests to me that we need some kind of separation between “all metadata” and “initialisation parameters”.

Regarding implementation: With the work in #287, I think it’s fairly straightforward to iterate over all model classes and use RegistryModel.get_param_combinations() to fill a data frame with individual models? Would it perhaps make sense to add other metadata as additional arguments to the RegistryModel decorator, but exclude them from the __init__?

Sheshuk commented 10 months ago

Yes, that are the use cases I also had in mind. For the implementation: I agree that we shouldn't put all additional fixed values to the init arguments. The solution with some dictionary with "static metadata" in the class will work. But there is also a case when metadata is derived based on the input arguments - for example, in Fornax_2022 or Mori models the PNS mass is derived from progenitor mass. And it would be good if use can query on these derived parameters as well.

So probably the most flexible solution would be to have a class method 'derive_metadata(**parameters) - > dict' returning the metadata dictionary. This way we can get all additional parameters without initializing the models.

We just need to make sure that our queries return the instruction to initialize the given model, i. e. the init parameters separately from the derived/static ones.

JostMigenda commented 10 months ago

Hmm … though iirc, for Fornax_2022 the PNS mass is in the .h5 file, so there's no way to collect that information without downloading and reading from all those files?

I guess a backup option—especially if there are other bits of metadata that aren't easy to collect at runtime—might be to preconpute this large table once as part of the release workflow and distribute a separate file with that metadata table as part of the package.

Sheshuk commented 10 months ago

Would it perhaps make sense to add other metadata as additional arguments to the RegistryModel decorator, but exclude them from the __init__?

Sorry, I missed this question. Probably you mean something like this:

@RegistryModel(
    progenitor_mass = [40 * u.Msun],
    eos=['LS220']
)
class OConnor_2015(loaders.OConnor_2015):
    """Model based on the black hole formation simulation in `O'Connor (2015) <https://arxiv.org/abs/1411.7058>`_.
    """
    def __init__(self, progenitor_mass:u.Quantity):

In current implementation of #287 it automatically adds the eos input parameter to the constructor, and the default values, so that they don't have to be listed in the class __init__.

So the signature for this model becomes

Init signature:
ccsn.OConnor_2015(
    filename: str = None,
    *,
    progenitor_mass: astropy.units.quantity.Quantity = <Quantity 40. solMass>,
    eos='LS220',
)

and to add more static metadata one just needs to put more parameters in the RegistryModel decorator.

Maybe this will be enough for a first implementation? Build a table just from the init parameters. Explicit is better than implicit, and so on.

Sheshuk commented 10 months ago

Hmm … though iirc, for Fornax_2022 the PNS mass is in the .h5 file, so there's no way to collect that information without downloading and reading from all those files?

Oh, right. It's in Mori's model PNS mass is retrieved just from a lookup dict. So maybe we shouldn't consider this parameter at all.

I guess a backup option—especially if there are other bits of metadata that aren't easy to collect at runtime—might be to preconpute this large table once as part of the release workflow and distribute a separate file with that metadata table as part of the package.

I don't like this option, because it means that such table would be "frozen" to a release, and make it harder to do development, e.g. when you add a new model class you need a to run a special step to regenerate the lookup table. It's a valid plan if we really need to query all the secondary parameters, but I would prefer to start with just browsing the init arguments.