SNEWS2 / snewpy

A Python package for working with supernova neutrinos
https://snewpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Model registry jan22 #184

Closed sybenzvi closed 1 year ago

sybenzvi commented 2 years ago

Enable instantiation of models without explicit reference to classes or model filenames in user code. Addresses issue #177.

JostMigenda commented 2 years ago

Since I’ve already been quite involved in the discussion in #177, I’d be happy to review this once it’s ready. However, just to give you a heads up—I will have very little time to work on SNEWPY through the rest of February and would probably not be able to get to this before March.

Sheshuk commented 2 years ago

Another thought - maybe it's worth making the registry a class ModelRegistry, which would store the data_folder as a member field? That way a user can work with the default location (astropy.get_cache_dir):

from snewpy.models import ccsn
# `load` uses `registry._model_registry` - a registry with the default location
model = ccsn.Nakazato_2013.load(**params)

but work with other registries if needed, for example using a context manager:

from snewpy.models import ccsn
from snewpy.registry import registry  # this is the context manager
with registry("~/snewpy/my_models"):
    # on __enter__() it sets `registry._model_registry` to this new path
    model = ccsn.Nakazato_2013.load(**params)
# on __exit__() the default registry location is restored

This would be a good place for any further improvement of the registry, like autocompletion, or reading the configuration file with all the models etc.
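For illustration, here is a minimal sketch of what such a ModelRegistry plus context manager could look like; the class, the module-level _model_registry and the use of astropy's cache directory are assumptions based on the snippets above, not existing snewpy code.

# Hypothetical sketch only; names follow the snippets above, not actual snewpy code.
from contextlib import contextmanager
from pathlib import Path

from astropy.config.paths import get_cache_dir  # default location for model files


class ModelRegistry:
    """Knows where model files are stored (and, eventually, how to fetch them)."""

    def __init__(self, data_folder=None):
        self.data_folder = Path(data_folder or get_cache_dir()) / "snewpy" / "models"

    def path_for(self, model_name, filename):
        # Resolve (and in real code, download if missing) the requested model file.
        return self.data_folder / model_name / filename


# Module-level default registry that `Model.load(**params)` would consult.
_model_registry = ModelRegistry()


@contextmanager
def registry(data_folder):
    """Temporarily point `_model_registry` at a different data folder."""
    global _model_registry
    previous = _model_registry
    _model_registry = ModelRegistry(data_folder)
    try:
        yield _model_registry
    finally:
        _model_registry = previous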

JostMigenda commented 2 years ago

I just had a chat with @joesmolsky who works on a SN code for KamLAND and it sounds like this PR is exactly what he’s looking for. Joe, can you test this out and see if this works for you?

joesmolsky commented 2 years ago

This does look like what I'm looking for. However, the example from the init_model() docstring doesn't work for me.

In ccsn.py on line 110, the check_valid_params() function is called without the required model positional argument.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/var/folders/bs/t_j9r03n3h39lmjyz9hr6b540000gn/T/ipykernel_18432/4209547403.py in <module>
----> 1 nak = init_model('Nakazato_2013', progenitor_mass=13*u.Msun, metallicity=0.004, revival_time=0*u.s, eos='shen')

~/src/snewpy/python/snewpy/models/registry.py in init_model(model_name, download, download_dir, **user_param)
     59 
     60     try:
---> 61         return getattr(module, model_name)(**user_param)
     62     except FileNotFoundError as e:
     63         logger = logging.getLogger()

~/src/snewpy/python/snewpy/models/base.py in _wrapper(self, *arg, **kwargs)
     16     @wraps(init)
     17     def _wrapper(self, *arg, **kwargs):
---> 18         init(self, *arg, **kwargs)
     19         check(self)
     20     return _wrapper

~/src/snewpy/python/snewpy/models/ccsn.py in __init__(self, filename, progenitor_mass, revival_time, metallicity, eos)
    108             user_params = dict(zip(self.param.keys(), (progenitor_mass, revival_time, metallicity, eos)))
    109             try:
--> 110                 check_valid_params(**user_params)
    111             except ValueError as e:
    112                 raise e

TypeError: check_valid_params() missing 1 required positional argument: 'model'

I also get this error using the Nakazato_2013 docstring example.
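For context, a minimal runnable sketch of the pattern the traceback points at (all names and values below are made up): check_valid_params takes the model as its first positional argument, so the call site in ccsn.py presumably needs to pass self along with the user parameters.

# Hypothetical sketch, not actual snewpy code: `check_valid_params` needs the model
# as its first positional argument, which the failing call in ccsn.py omitted.
def check_valid_params(model, **user_params):
    """Validate user-supplied parameters against the model's allowed values."""
    for name, value in user_params.items():
        if value not in model.param[name]:
            raise ValueError(f"Invalid {name}={value}; allowed values are {model.param[name]}")


class FakeNakazato:
    # `param` maps each physics parameter to its allowed values,
    # mirroring the `self.param` seen in the traceback above.
    param = {"progenitor_mass": [13, 20, 30], "eos": ["shen", "togashi"]}


# Passing the model (here a stand-in class with a `param` attribute) makes the call work:
check_valid_params(FakeNakazato, progenitor_mass=13, eos="shen")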

JostMigenda commented 2 years ago

What we (@sybenzvi, @sgriswol, @Sheshuk and I) discussed just now at the Hackathon was to split all model classes into “registry classes” (e.g. Bollig_2016), which check physics parameters, create file names from them and ensure the files are downloaded, and “loader classes” (e.g. _GarchingArchiveModel), which only deal with loading the file from disk and providing access to the flux data.

The previous commit splits up the Nakazato_2013 class as an example. Calling the snewpy.models.ccsn.Nakazato_2013 (registry) class returns an instance of the snewpy.models.loaders.Nakazato_2013 (loader) class, which is maybe a bit surprising, but is a reasonably pragmatic design imho.
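As an illustration of this pattern (placeholder names, not the actual snewpy implementation): the registry class can override __new__ to validate the physics parameters, build the file name and hand off to the corresponding loader class.

# Sketch only: a registry class whose constructor returns a loader instance.
class _NakazatoLoader:                   # stand-in for snewpy.models.loaders.Nakazato_2013
    """Loader: reads the file from disk and exposes the flux data."""

    def __init__(self, filename):
        self.filename = filename         # real code would open and parse the file here


class Nakazato_2013:                     # stand-in for snewpy.models.ccsn.Nakazato_2013
    """Registry class: checks parameters, builds the file name, ensures download."""

    def __new__(cls, *, progenitor_mass, metallicity, revival_time, eos):
        # 1. validate the physics parameters (omitted in this sketch)
        # 2. build the file name from them (format is illustrative only)
        filename = f"nakazato-{eos}-z{metallicity}-t_rev{revival_time}ms-s{progenitor_mass}.fits"
        # 3. ensure the file is downloaded (omitted), then hand off to the loader
        return _NakazatoLoader(filename)


model = Nakazato_2013(progenitor_mass=13, metallicity=0.004, revival_time=100, eos="shen")
print(type(model).__name__)              # -> _NakazatoLoader, as described above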

JostMigenda commented 2 years ago

Just to note: In snewpy.snowglobes the two functions generate_time_series and generate_fluence currently take a file name/path and the name of the corresponding class (as a string) as arguments. Those two functions will need to be updated as well, in line with our changes here. Instead of

model = "Zha_2021"
model_path = f"{model_dir}/{model}/s20.dat"
generate_time_series(model_path, model, *args)

we could have something like

model = init_model('Zha_2021', progenitor_mass=20*u.Msun)
generate_time_series(model, *args)

That would break any existing code that uses those functions; just like these changes to the model classes break some existing code. So we should keep all this for a large SNEWPY 2.0 update, I guess?

sgriswol commented 2 years ago

I have a few small items I would like to resolve before this branch is merged. The following is a crude checklist and a few thoughts to help organize these items.

One larger problem was discussed with @JostMigenda and @sybenzvi (Could be added as a new issue)

JostMigenda commented 2 years ago

Re: backwards compatibility: I realise this would pretty much mean having an initialiser like __init__(self, filename=None, *, progenitor_mass=None, revival_time=None, metallicity=None, eos=None) in the registry (ccsn.Name_YYYY) classes; which we all agree is not nice. Still, as a temporary solution in “SNEWPY v1.3” (that we then rip out in v2.0) I think it would be better than breaking people’s existing code without warning.

sybenzvi commented 2 years ago

Re: backwards compatibility: I realise this would pretty much mean having an initialiser like __init__(self, filename=None, *, progenitor_mass=None, revival_time=None, metallicity=None, eos=None) in the registry (ccsn.Name_YYYY) classes; which we all agree is not nice. Still, as a temporary solution in “SNEWPY v1.3” (that we then rip out in v2.0) I think it would be better than breaking people’s existing code without warning.

I agree this isn't nice, but some kind of warning message is needed to help people make the transition.

sgriswol commented 2 years ago

These most recent changes restore import-by-filename functionality and add deprecation warnings to the ccsn models. The check to issue a deprecation warning is implemented via a decorator on the __new__ methods. When we're ready to remove that functionality, only a few lines will need to be removed from each model. Let me know if more changes are needed. Otherwise, I think this branch is nearly ready to be merged.
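Roughly, that mechanism could look like the sketch below; the decorator name and warning text are placeholders, not the actual code in this branch.

import warnings
from functools import wraps


def _warn_on_filename_init(new):
    """Warn when a model is constructed from a file name instead of physics parameters."""
    @wraps(new)
    def _wrapper(cls, filename=None, **params):
        if filename is not None:
            warnings.warn(
                "Initialising this model from a file name is deprecated; "
                "please pass physics parameters instead.",
                DeprecationWarning,
            )
        return new(cls, filename=filename, **params)
    return _wrapper


class Nakazato_2013:
    @_warn_on_filename_init
    def __new__(cls, filename=None, **params):
        # When the old behaviour is removed, only the decorator (and the
        # filename handling) needs to be deleted from each model class.
        return super().__new__(cls)


Nakazato_2013(filename="nakazato-shen-z0.004-t_rev100ms-s13.0.fits")  # emits the warning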

sgriswol commented 2 years ago

The tests are failing due to the Warren_2020 models; however, the reason seems to be somewhat inconsistent. Yesterday (09/01), it appears that some issues with Zenodo were causing downloads to fail. Running the tests locally, downloading the Warren_2020 tarballs took too long and the connection timed out, even though my connection was stable. A build job I ran yesterday also resulted in failure, but there the download appeared to complete; a JSONDecodeError was thrown at _model_downloader.py:130, suggesting the model file may not have downloaded correctly.

Today (09/02) the downloads succeed both locally and on build jobs, but the build job tests fail while my local tests pass. During the build job, a KeyError is thrown at _model_downloader.py:133 due to a missing 'files' key in the JSON of a requests response. Running locally (with the same Python and requests versions), I don't see this error. Another set of eyes on this would be appreciated.

I've attached the raw logs from both of the build jobs I mentioned. registry_PR_workflow_090122.log registry_PR_workflow_090222.log

JostMigenda commented 2 years ago

I just finished looking through the existing Jupyter notebooks and found there are still a few issues with backwards compatibility:

Other notes:

From the documentation, I think a FutureWarning would be better than a DeprecationWarning. It doesn't make a difference when users initialise the model manually, e.g. in a notebook; but DeprecationWarnings are hidden by default if users e.g. write their own module with plotting functions, initialise models in that module, and then call code in that module from a notebook. I've also tweaked the wording of the warning a bit.
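For reference on that default behaviour: Python's default warning filters show a DeprecationWarning only when it is triggered from code in __main__, whereas a FutureWarning is always shown. A small illustration, using warnings.warn_explicit to fake warnings coming from a user's own module:

import warnings

# Pretend these warnings are issued from a user's own module "user_plotting",
# i.e. not from code running in __main__ (as happens when models are initialised
# inside a user's plotting module that is then called from a notebook).
warnings.warn_explicit("old-style model init", DeprecationWarning, "user_plotting.py", 1)
warnings.warn_explicit("old-style model init", FutureWarning, "user_plotting.py", 1)
# Under the default filters only the FutureWarning is printed; the DeprecationWarning
# is swallowed by the built-in 'ignore::DeprecationWarning' filter.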

Regarding tests: I also noticed that the automated tests on the recent commits took up to ~40 min to run, and much of that appears to be time spent downloading the large Warren_2020 files from Zenodo. Even if it's normally a bit faster, I worry that it severely diminishes the usefulness of an automated test suite if it takes this long to point out any problems.

JostMigenda commented 1 year ago

@sgriswol and I just had a long conversation about this PR. Brief summary of key points:

JostMigenda commented 1 year ago

Awesome; thanks a lot @sgriswol!

From my testing, this fixes all the issues listed earlier. Now, there are a couple of remaining bits:

However, I think none of these urgently need to be part of this PR. Instead, I’d suggest merging this PR as is and releasing a new beta (v1.3b2) so more people can try this out and we can check that there are no issues we’ve missed. We can then open a separate issue for the bullet points above and decide which of them need to go in urgently for v1.3 and which ones are less urgent.

Sheshuk commented 1 year ago

Thanks for the great work on this! But may I ask why my review and suggestions were completely ignored in this PR? I don't mean that everything I suggested has to be implemented and discussed in detail, but there was not a single reaction - why do we need code review then?

JostMigenda commented 1 year ago

Of the four comments in your review, I think two (regarding valid parameter combinations and separating inits from file name/parameters) were resolved/obsoleted by changes during & since the hackathon; I've moved the third (re: using "ccsn.Bollig_2016" instead of just "Bollig_2016" in init_model) to #225, and we can discuss the fourth (re: moving check_param_values into the model class(es)) as part of #224 (I've just edited that issue to add a link to your comment; sorry I missed that initially!).

In #224, I’ve also included cleaning up duplicate code in loader classes, which I remember you pointed out in a discussion during the hackathon. Did I miss any other comments or suggestions?

Sheshuk commented 1 year ago

I'm very grateful that my suggestions were taken into account; what I'm talking about here is the communication.

were resolved/obsoleted by changes during & since the hackathon

If they get resolved or become obsolete, one should (a) write a comment about it, or (b) at least hit the "Resolve conversation" button, and (c) optionally re-request the review. I got no response at all. Also, my suggestions (in particular #225) were for this particular PR, before merging it to main. And then you come along and make the decision to merge this.

As remote collaborators, the mechanisms here on GitHub are our main means of communication, and to be honest this is frustrating and makes my work hard. Maybe we should establish more formally how we collaborate? I suppose I misinterpreted the way we should be doing code review (by coincidence, this practice was suggested exactly when I joined snewpy development) - I thought that I was supposed to be doing reviews along with the others. But since there is clearly a hierarchy of users here, maybe you should formalize it by setting roles on the repository with review permissions? That way I won't be mistaken about my role and won't waste my time on things that get ignored or rejected.

JostMigenda commented 1 year ago

Andrey, I don’t feel your comment accurately represents our communication here.

If they get resolved or become obsolete, one should (a) write a comment about it, or (b) at least hit the "Resolve conversation" button, and (c) optionally re-request the review.

I did mark the first two points as resolved. (As far as I can tell, GitHub doesn't display when I did this; but from memory, it was while I was writing my earlier comment reviewing the remaining bits.) I did not think it was necessary to write a comment there, because I remember we discussed both points on Zoom calls with you during the hackathon. (However, if GitHub did not send you notifications when I marked those conversations as resolved—I assumed it does send out notifications for that?!—I can keep that in mind and add explicit comments in the future.)

What we can probably learn from this for next time is that even if we have discussions on Zoom, we should make sure to put key points in writing in GitHub as well. That is a fair point and I’ll try to remember that in the future.

Also, my suggestions (in particular https://github.com/SNEWS2/snewpy/issues/225) were for this particular PR, before merging it to main. And then you come along and make the decision to merge this.

I did write a comment on Tuesday listing the remaining points (explicitly mentioning your suggestion that's now in #225) and saying that I'd suggest merging this now and moving those remaining points to new issues. I then waited for two days—during which you were actively working on snewpy, as seen in other Issues/PRs—and only merged this when there was no response.

Sheshuk commented 1 year ago

I did mark the first two points as resolved.

You're right, I'm sorry; I see now that they are resolved, and I didn't get any notifications. The other two, however, are still open, so the status of this PR is "changes requested" even though you approved it. So I ask you, in such situations in the future, to comment directly with something like "moving this suggestion to a new issue", or whatever other reason you have for considering the question resolved.

I then waited for two days—during which you were actively working on snewpy, as seen in other Issues/PRs—and only merged this when there was no response

I was working on other things and was not going to switch context: this PR is huge, and since, as I understand it, it was mostly discussed offline, I would need to read the code again to catch up. If your suggestion was a question for me, you should have asked me directly or re-requested the review, rather than just ignoring the old one. I think this is needed because there is an open review, so I'm a part of this discussion.

In general, what I asked for here is fair communication - and GitHub provides you with the means for that.