Open agijsberts opened 5 months ago
Given that I was baffled that nobody seemed to have reported such a basic problem earlier, I investigated this issue further and, not surprisingly, it is a bit more complicated than I initially thought.
It turns out the issue surfaced only recently, due to changes introduced in `huggingface_hub >= 0.22.0`. So for the moment the best workaround is to downgrade `huggingface_hub` to version 0.21.4:

```
pip3 install huggingface-hub==0.21.4
```
Prior to https://github.com/huggingface/huggingface_hub/commit/45147c518ad3c1f70ecb462de4bf23cd553ba54b, `config` would be injected in the `model_kwargs` if the model class accepted a `**kwargs` argument. From version 0.22.0 onwards, the logic for when `config` is injected has changed. Looking at the code and the docs at https://huggingface.co/docs/huggingface_hub/main/en/guides/integrations#a-more-complex-approach-class-inheritance, I believe that `config` is injected if (a) the model class expects a parameter `config` in its `__init__`, and/or (b) the model class expects a parameter `config` in its `_from_pretrained`.
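The injection rules described above can be sketched in plain Python. This is an illustration of the behaviour, not the actual `huggingface_hub` implementation; the class and parameter names are made up:

```python
import inspect

def inject_config(cls, config, model_kwargs):
    """Sketch of the >=0.22.0 injection rules: `config` is passed as a
    single argument only if the class signature expects a `config`
    parameter; a bare **kwargs class gets the entries one by one."""
    params = inspect.signature(cls.__init__).parameters
    if "config" in params:
        # (a) the class expects an explicit `config` parameter
        model_kwargs["config"] = config
    elif any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        # the class only accepts **kwargs: inject config entries one by one
        model_kwargs.update(config)
    return model_kwargs

class WithConfig:
    def __init__(self, config=None):
        self.config = config

class WithKwargs:
    def __init__(self, **kwargs):
        self.kwargs = kwargs

print(inject_config(WithConfig, {"forecast_steps": 6}, {}))  # {'config': {'forecast_steps': 6}}
print(inject_config(WithKwargs, {"forecast_steps": 6}, {}))  # {'forecast_steps': 6}
```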
More relevantly though, if the model class accepts a `**kwargs` argument in its `__init__`, then the contents of the config are injected one by one into `**model_kwargs` (as opposed to injecting the config as-is in `model_kwargs['config']`). This setting actually seems more applicable to me in `NowcastingModelHubMixin`: wouldn't it be sufficient to replace `model = cls(**model_kwargs['config'])` with `model = cls(**model_kwargs)` in `_from_pretrained`? And wouldn't that also allow us to simplify `DGMR.__init__`?
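The proposed change can be sketched with stand-ins like this (simplified: the real `_from_pretrained` also downloads and loads the weights, and `forecast_steps` is just an illustrative parameter, not the actual DGMR signature):

```python
class Model:
    """Stand-in for DGMR: plain keyword arguments plus **kwargs."""
    def __init__(self, forecast_steps=18, **kwargs):
        self.forecast_steps = forecast_steps

def from_pretrained_current(cls, **model_kwargs):
    # current behaviour: raises KeyError when no 'config' entry was injected
    return cls(**model_kwargs["config"])

def from_pretrained_proposed(cls, **model_kwargs):
    # proposed fix: forward whatever the mixin injected, entry by entry
    return cls(**model_kwargs)

model = from_pretrained_proposed(Model, forecast_steps=6)
print(model.forecast_steps)  # 6
```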
Thanks @agijsberts for finding this. Should we pin the requirements file to `huggingface-hub==0.21.4`? Would you be interested in making a PR for this?
Would you be able to expand further on the `model_kwargs` point? Perhaps even some links to the code where it could be improved?
Pinning `huggingface-hub==0.21.4` is a sensible stopgap in my opinion. I'll prepare a PR right away.

Regarding the `model_kwargs`, and without going into too much detail: a significant chunk of `DGMR.__init__` is dedicated to dealing with a potential `config` parameter and, when passed, to using it to override the normal arguments. If I understood the workings of `ModelHubMixin` correctly, it should no longer be necessary to rely on such a "magic" `config` argument, and removing that logic would make `DGMR.__init__` agnostic to the mixin. I volunteer to have a look at what the best way to resolve this would be.
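To make the point concrete, here is a hypothetical before/after sketch of the `__init__` pattern being discussed (parameter names are illustrative, not the actual DGMR signature):

```python
class BeforeInit:
    """Mimics the current pattern: a magic `config` kwarg overrides the
    normal constructor arguments."""
    def __init__(self, forecast_steps=18, output_shape=256, **kwargs):
        config = kwargs.get("config", {})
        self.forecast_steps = config.get("forecast_steps", forecast_steps)
        self.output_shape = config.get("output_shape", output_shape)

class AfterInit:
    """With the mixin injecting config entries as plain kwargs, __init__
    needs no knowledge that a config ever existed."""
    def __init__(self, forecast_steps=18, output_shape=256, **kwargs):
        self.forecast_steps = forecast_steps
        self.output_shape = output_shape
```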
Thanks @agijsberts, yeah, if you were able to have a look at `DGMR.__init__` and at the best way to resolve this, that would be super.
I have just pushed the possible `__init__` simplification in my fork https://github.com/agijsberts/skillful_nowcasting/tree/refactor_huggingface-hub. This already includes tests to make sure that serialization-deserialization returns an identical object for `DGMR`, `Discriminator`, etc. Unfortunately that introduces a pip dependency on `safetensors`. In any case, after my holidays I want to check that predictions on some test data are still the same before turning this into a PR.
Perhaps someone can clarify the reason that `DGMR` needs a custom `NowcastingModelHubMixin` instead of using `huggingface-hub`'s `PyTorchModelHubMixin` (as used by `Discriminator`, `Sampler`, etc.). My impression is that both do the same thing (except that `PyTorchModelHubMixin` writes in `safetensors` format), but I'm probably missing something.
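For reference, the mixin pattern in question can be mimicked with a stdlib-only stand-in. This is only a sketch of the idea — the real `PyTorchModelHubMixin` serializes weights to safetensors and talks to the Hub, and `Sampler`'s real constructor differs:

```python
class HubMixinSketch:
    """Stand-in for PyTorchModelHubMixin: inheriting it gives a class
    save_pretrained/from_pretrained for free."""
    def save_pretrained(self, store):
        # the real mixin writes model.safetensors + config.json to a directory
        store["config"] = dict(self.__dict__)

    @classmethod
    def from_pretrained(cls, store):
        return cls(**store["config"])

class Sampler(HubMixinSketch):
    def __init__(self, num_samples=3):
        self.num_samples = num_samples

store = {}
Sampler(num_samples=5).save_pretrained(store)
restored = Sampler.from_pretrained(store)
print(restored.num_samples)  # 5
```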
Thanks @agijsberts - have a good holiday, and I look forward to seeing the PR.
If I'm honest, I'm not totally sure why `NowcastingModelHubMixin` is used instead of `PyTorchModelHubMixin`, but I'll have to have a look at it.
**Describe the bug**

When loading the pretrained model from HuggingFace as per the instructions in README.md, I get a `KeyError: 'config'` exception. The issue is that when following the instructions, `**model_kwargs` will be empty in `NowcastingModelHubMixin._from_pretrained`, but it then attempts to pass `**model_kwargs['config']` to the class constructor in https://github.com/openclimatefix/skillful_nowcasting/blob/8c40296e79b3a5849269e8da475b6f9f6fd1ac5f/dgmr/hub.py#L154

**To Reproduce**

**Expected behavior**

I expect to have a variable `model` with the pretrained `DGMR` model.

**Additional context**

One can trivially work around the problem by passing an empty `config` argument when loading the pretrained model, so maybe this is just a matter of updating README.md. Still, I'd argue that the cleaner fix would be to pass the entire `**model_kwargs` to the class constructor, i.e. `model = cls(**model_kwargs)`. That would also coincide with how HuggingFace themselves do it in https://github.com/huggingface/huggingface_hub/blob/855ee997202e003b80bafd3c02dac65146c89cc4/src/huggingface_hub/hub_mixin.py#L633.
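The failure mode and the workaround can be reproduced without any dependencies. This is a sketch of the relevant lookup, not the actual `dgmr/hub.py` code, and `FakeDGMR` is a made-up stand-in:

```python
def load(cls, **model_kwargs):
    # mirrors the failing line: with huggingface_hub >= 0.22.0 nothing
    # injects 'config', so model_kwargs is empty and the lookup raises
    return cls(**model_kwargs["config"])

class FakeDGMR:
    def __init__(self, **kwargs):
        self.kwargs = kwargs

try:
    load(FakeDGMR)
except KeyError as err:
    print("KeyError:", err)  # KeyError: 'config'

# workaround: the caller passes config={} explicitly, so the lookup succeeds
model = load(FakeDGMR, config={})
```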