Open agijsberts opened 5 months ago
Given that I was baffled that nobody seemed to have reported such a basic problem earlier, I investigated this issue further and, not surprisingly, it is a bit more complicated than I initially thought.
It turns out the issue surfaced only recently, due to changes introduced in `huggingface_hub >= 0.22.0`. So for the moment the best workaround is to downgrade `huggingface_hub` to version 0.21.4:

```
pip3 install huggingface-hub==0.21.4
```
Prior to https://github.com/huggingface/huggingface_hub/commit/45147c518ad3c1f70ecb462de4bf23cd553ba54b, `config` would be injected in the `model_kwargs` if the model class accepted a `**kwargs` argument. From version 0.22.0 onwards, the logic for when `config` is injected has changed. Looking at the code and the docs at https://huggingface.co/docs/huggingface_hub/main/en/guides/integrations#a-more-complex-approach-class-inheritance, I believe that `config` is injected if (a) the model class expects a parameter `config` in its `__init__`, and/or (b) the model class expects a parameter `config` in its `_from_pretrained`.
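The injection rules described above can be sketched in plain Python. This is an illustration of the behaviour, not the actual `huggingface_hub` implementation; the class and parameter names are made up:

```python
import inspect

def inject_config(cls, config, model_kwargs):
    """Sketch of the >=0.22.0 injection rules: `config` is passed as a
    single argument only if the class signature expects a `config`
    parameter; a bare **kwargs class gets the entries one by one."""
    params = inspect.signature(cls.__init__).parameters
    if "config" in params:
        # (a) the class expects an explicit `config` parameter
        model_kwargs["config"] = config
    elif any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        # the class only accepts **kwargs: inject config entries one by one
        model_kwargs.update(config)
    return model_kwargs

class WithConfig:
    def __init__(self, config=None):
        self.config = config

class WithKwargs:
    def __init__(self, **kwargs):
        self.kwargs = kwargs

print(inject_config(WithConfig, {"forecast_steps": 6}, {}))  # {'config': {'forecast_steps': 6}}
print(inject_config(WithKwargs, {"forecast_steps": 6}, {}))  # {'forecast_steps': 6}
```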
More relevantly though, if the model class accepts a `**kwargs` argument in its `__init__`, then the contents of the config are injected one by one into `**model_kwargs` (as opposed to injecting the config as-is in `model_kwargs['config']`). This setting actually seems more applicable to me in `NowcastingModelHubMixin`: wouldn't it be sufficient to replace `model = cls(**model_kwargs['config'])` with `model = cls(**model_kwargs)` in `_from_pretrained`? And wouldn't that also allow us to simplify `DGMR.__init__`?
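The proposed change can be sketched with stand-ins like this (simplified: the real `_from_pretrained` also downloads and loads the weights, and `forecast_steps` is just an illustrative parameter, not the actual DGMR signature):

```python
class Model:
    """Stand-in for DGMR: plain keyword arguments plus **kwargs."""
    def __init__(self, forecast_steps=18, **kwargs):
        self.forecast_steps = forecast_steps

def from_pretrained_current(cls, **model_kwargs):
    # current behaviour: raises KeyError when no 'config' entry was injected
    return cls(**model_kwargs["config"])

def from_pretrained_proposed(cls, **model_kwargs):
    # proposed fix: forward whatever the mixin injected, entry by entry
    return cls(**model_kwargs)

model = from_pretrained_proposed(Model, forecast_steps=6)
print(model.forecast_steps)  # 6
```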
Thanks @agijsberts for finding this. Should we pin the requirements file to `huggingface-hub==0.21.4`? Would you be interested in making a PR for this?
Would you be able to expand further on the `model_kwargs` point? Perhaps even some links to the code where it could be improved?
Pinning `huggingface-hub==0.21.4` is a sensible stopgap in my opinion. I'll prepare a PR right away.

Regarding the `model_kwargs`, and without going into too much detail: a significant chunk of `DGMR.__init__` is dedicated to dealing with a potential `config` parameter and, when passed, to using it to override the normal arguments. If I understood the workings of `ModelHubMixin` correctly, it should no longer be necessary to rely on such a "magic" `config` argument, and removing that logic would make `DGMR.__init__` agnostic to the mixin. I volunteer to have a look at what the best way to resolve this would be.
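To make the point concrete, here is a hypothetical before/after sketch of the `__init__` pattern being discussed (parameter names are illustrative, not the actual DGMR signature):

```python
class BeforeInit:
    """Mimics the current pattern: a magic `config` kwarg overrides the
    normal constructor arguments."""
    def __init__(self, forecast_steps=18, output_shape=256, **kwargs):
        config = kwargs.get("config", {})
        self.forecast_steps = config.get("forecast_steps", forecast_steps)
        self.output_shape = config.get("output_shape", output_shape)

class AfterInit:
    """With the mixin injecting config entries as plain kwargs, __init__
    needs no knowledge that a config ever existed."""
    def __init__(self, forecast_steps=18, output_shape=256, **kwargs):
        self.forecast_steps = forecast_steps
        self.output_shape = output_shape
```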
Thanks @agijsberts, yeah, if you were able to have a look at `DGMR.__init__` and at the best way to resolve this, that would be super.
I have just pushed the possible `__init__` simplification in my fork https://github.com/agijsberts/skillful_nowcasting/tree/refactor_huggingface-hub. This already includes tests to make sure that serialization-deserialization returns an identical object for `DGMR`, `Discriminator`, etc. Unfortunately that introduces a pip dependency on `safetensors`. In any case, after my holidays I want to check that predictions on some test data are still the same before turning this into a PR.
Perhaps someone can clarify the reason that `DGMR` needs a custom `NowcastingModelHubMixin` instead of using `huggingface-hub`'s `PyTorchModelHubMixin` (as used by `Discriminator`, `Sampler`, etc.). My impression is that both do the same thing (except that `PyTorchModelHubMixin` writes in `safetensors` format), but I'm probably missing something.
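For reference, the mixin pattern in question can be mimicked with a stdlib-only stand-in. This is only a sketch of the idea — the real `PyTorchModelHubMixin` serializes weights to safetensors and talks to the Hub, and `Sampler`'s real constructor differs:

```python
class HubMixinSketch:
    """Stand-in for PyTorchModelHubMixin: inheriting it gives a class
    save_pretrained/from_pretrained for free."""
    def save_pretrained(self, store):
        # the real mixin writes model.safetensors + config.json to a directory
        store["config"] = dict(self.__dict__)

    @classmethod
    def from_pretrained(cls, store):
        return cls(**store["config"])

class Sampler(HubMixinSketch):
    def __init__(self, num_samples=3):
        self.num_samples = num_samples

store = {}
Sampler(num_samples=5).save_pretrained(store)
restored = Sampler.from_pretrained(store)
print(restored.num_samples)  # 5
```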
Thanks @agijsberts - have a good holiday, and I look forward to seeing the PR.
If I'm honest, I'm not totally sure why `NowcastingModelHubMixin` is used instead of `PyTorchModelHubMixin`, but I'll have to have a look at it.
**Describe the bug**

When loading the pretrained model from HuggingFace as per the instructions in README.md, I get a `KeyError: 'config'` exception. The issue is that when following the instructions, `**model_kwargs` will be empty in `NowcastingModelHubMixin._from_pretrained`, but it then attempts to pass `**model_kwargs['config']` to the class constructor in https://github.com/openclimatefix/skillful_nowcasting/blob/8c40296e79b3a5849269e8da475b6f9f6fd1ac5f/dgmr/hub.py#L154

**To Reproduce**

**Expected behavior**

I expect to have a variable `model` with the pretrained `DGMR` model.

**Additional context**

One can trivially work around the problem by passing an empty `config` argument when loading the pretrained model, so maybe this is just a matter of updating README.md. Still, I'd argue that the cleaner fix would be to pass the entire `**model_kwargs` to the class constructor, i.e. `model = cls(**model_kwargs)`. That would also coincide with how HuggingFace themselves do it in https://github.com/huggingface/huggingface_hub/blob/855ee997202e003b80bafd3c02dac65146c89cc4/src/huggingface_hub/hub_mixin.py#L633.
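The failure mode and the workaround can be reproduced without any dependencies. This is a sketch of the relevant lookup, not the actual `dgmr/hub.py` code, and `FakeDGMR` is a made-up stand-in:

```python
def load(cls, **model_kwargs):
    # mirrors the failing line: with huggingface_hub >= 0.22.0 nothing
    # injects 'config', so model_kwargs is empty and the lookup raises
    return cls(**model_kwargs["config"])

class FakeDGMR:
    def __init__(self, **kwargs):
        self.kwargs = kwargs

try:
    load(FakeDGMR)
except KeyError as err:
    print("KeyError:", err)  # KeyError: 'config'

# workaround: the caller passes config={} explicitly, so the lookup succeeds
model = load(FakeDGMR, config={})
```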