Closed Sheshuk closed 7 months ago
Agreed; this would be a nice feature!
The more I think about this, the more I like this idea; and I’m starting to think there is a much broader set of interesting queries than I initially realized. Just a few more examples:
(Okay, the last one is perhaps more of a stretch goal …)
Right now, none of these queries would be possible to do with the model initialisation parameters; and adding all this additional metadata in the __init__
signature (even though just 1–2 arguments typically need to be provided by the user) would be overwhelming. This suggests to me that we need some kind of separation between “all metadata” and “initialisation parameters”.
Regarding implementation: With the work in #287, I think it’s fairly straightforward to iterate over all model classes and use RegistryModel.get_param_combinations()
to fill a data frame with individual models? Would it perhaps make sense to add other metadata as additional arguments to the RegistryModel
decorator, but exclude them from the __init__
?
Yes, that are the use cases I also had in mind. For the implementation: I agree that we shouldn't put all additional fixed values to the init arguments. The solution with some dictionary with "static metadata" in the class will work. But there is also a case when metadata is derived based on the input arguments - for example, in Fornax_2022 or Mori models the PNS mass is derived from progenitor mass. And it would be good if use can query on these derived parameters as well.
So probably the most flexible solution would be to have a class method 'derive_metadata(**parameters) - > dict' returning the metadata dictionary. This way we can get all additional parameters without initializing the models.
We just need to make sure that our queries return the instruction to initialize the given model, i. e. the init parameters separately from the derived/static ones.
Hmm … though iirc, for Fornax_2022
the PNS mass is in the .h5 file, so there's no way to collect that information without downloading and reading from all those files?
I guess a backup option—especially if there are other bits of metadata that aren't easy to collect at runtime—might be to preconpute this large table once as part of the release workflow and distribute a separate file with that metadata table as part of the package.
Would it perhaps make sense to add other metadata as additional arguments to the
RegistryModel
decorator, but exclude them from the__init__
?
Sorry, I missed this question. Probably you mean something like this:
@RegistryModel(
progenitor_mass = [40 * u.Msun],
eos=['LS220']
)
class OConnor_2015(loaders.OConnor_2015):
"""Model based on the black hole formation simulation in `O'Connor (2015) <https://arxiv.org/abs/1411.7058>`_.
"""
def __init__(self, progenitor_mass:u.Quantity):
In current implementation of #287 it automatically adds the eos
input parameter to the constructor, and the default values, so that they don't have to be listed in the class __init__
.
So the signature for this model becomes
Init signature:
ccsn.OConnor_2015(
filename: str = None,
*,
progenitor_mass: astropy.units.quantity.Quantity = <Quantity 40. solMass>,
eos='LS220',
)
and to add more static metadata one just needs to put more parameters in the RegistryModel
decorator.
Maybe this will be enough for a first implementation? Build a table just from the init parameters. Explicit is better than implicit, and so on.
Hmm … though iirc, for
Fornax_2022
the PNS mass is in the .h5 file, so there's no way to collect that information without downloading and reading from all those files?
Oh, right. It's in Mori's model PNS mass is retrieved just from a lookup dict. So maybe we shouldn't consider this parameter at all.
I guess a backup option—especially if there are other bits of metadata that aren't easy to collect at runtime—might be to preconpute this large table once as part of the release workflow and distribute a separate file with that metadata table as part of the package.
I don't like this option, because it means that such table would be "frozen" to a release, and make it harder to do development, e.g. when you add a new model class you need a to run a special step to regenerate the lookup table. It's a valid plan if we really need to query all the secondary parameters, but I would prefer to start with just browsing the init arguments.
It would be nice for a user to have a possibility to query our ModelRegistry for models with specific parameters, e.g. get a list of models which use
EOS='LS220'
and haveprogenitor_mass=15 Msun
.I think it's rather simple to implement, since we are listing the allowed parameter combinations for each model. For example, it can be implemented as pandas.DataFrame, which lists all possible parameters for all registry models.