huggingface / huggingface_hub

The official Python client for the Huggingface Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

better logs for ModelHubMixin unserialized parameters #2297

Closed not-lain closed 1 month ago

not-lain commented 1 month ago

Maybe add a warning when instantiating a new model if there have been problems with the serialization.

tagging @Wauplin for better visibility
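To make the request concrete, here is a rough sketch of what such a warning could look like. This is a hypothetical pure-Python stand-in, not the actual ModelHubMixin implementation: check each `__init__` keyword argument for JSON-serializability at instantiation time and warn when one would be dropped from the config.

```python
import json
import warnings


class SerializationWarningMixin:
    """Hypothetical sketch (not the real ModelHubMixin): warn at
    instantiation when an __init__ keyword argument cannot be
    JSON-serialized and would therefore be dropped from the config."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        original_init = cls.__init__

        def wrapped_init(self, *args, **kw):
            for name, value in kw.items():
                try:
                    json.dumps(value)
                except TypeError:
                    warnings.warn(
                        f"Parameter '{name}' of type {type(value).__name__} "
                        "is not JSON-serializable and will not be stored in "
                        "the config. Consider registering a custom coder."
                    )
            original_init(self, *args, **kw)

        cls.__init__ = wrapped_init


class MyModel(SerializationWarningMixin):
    def __init__(self, hidden_size=8, activation=None):
        self.hidden_size = hidden_size
        self.activation = activation
```

Passing `activation=lambda x: x` would trigger the warning, while `hidden_size=16` serializes fine and stays silent.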

Wauplin commented 1 month ago

Hi @not-lain, thanks for the suggestion.

input param of an unsupported type => tell user to use the coders param and a link to docs

I would prefer not to. For ModelHubMixin, we should differentiate between two types of users: the library maintainers who will integrate the mixin, and the end users who will not know what to do with the warning. Since this warning would only be addressed to library maintainers, I don't think it should be triggered. Maintainers integrating the mixin should test their integration before releasing (and so hopefully notice that their parameters are not serialized).
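For context, the `coders` mechanism mentioned above can be sketched roughly as follows. This is a hypothetical stand-alone registry, assuming the real huggingface_hub `coders` argument maps a type to an (encoder, decoder) pair of callables; see the docs for the actual API.

```python
import json


# Hypothetical example type that json.dumps cannot handle on its own.
class Resolution:
    def __init__(self, width, height):
        self.width = width
        self.height = height


# Registry mapping a type to an (encode, decode) pair, in the spirit of
# the `coders` param: encode turns the object into JSON-friendly data,
# decode rebuilds the object from that data.
CODERS = {
    Resolution: (
        lambda r: {"width": r.width, "height": r.height},
        lambda d: Resolution(d["width"], d["height"]),
    ),
}


def encode_value(value):
    """Encode a config value, using a registered coder when available."""
    for typ, (encode, _) in CODERS.items():
        if isinstance(value, typ):
            return encode(value)
    json.dumps(value)  # raises TypeError if the type is still unsupported
    return value


def decode_value(typ, data):
    """Rebuild a value of `typ` from its encoded form."""
    if typ in CODERS:
        return CODERS[typ][1](data)
    return data
```

With a coder registered, `Resolution(640, 480)` round-trips through plain JSON; without one, `encode_value` raises the usual `TypeError`.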

when a param exists both as a key in the config param and as a parameter of the same name in the init method

Has this ever happened in a real use case? Same as my point above: I think it'll be confusing for end users to get such a warning, as they most likely won't understand what to do (and wouldn't be able to do anything anyway).

not-lain commented 1 month ago

Hi @Wauplin

Wauplin commented 1 month ago

example: remi's repo on HF

What script/library is supposed to open this repo? And what type was not handled in the process?

not-lain commented 1 month ago

What script/library is supposed to open this repo? And what type was not handled in the process?

this one: https://github.com/huggingface/lerobot/blob/d585c73f9f73ed3313f9e12969e60038389e6f59/lerobot/common/logger.py#L153

also, this one will give you a better look at the repo : https://github.com/huggingface/lerobot/blob/042e1939955a1eba771f0f0a31532d87b11a9993/lerobot/common/policies/factory.py#L82

For full transparency, they are using ModelHubMixin only with the policies, since they are using OmegaConf elsewhere. I believe they could streamline how they load and push their models by injecting the _cfg parameter into the policy's init parameters, so that loading and pushing models is handled seamlessly through the Hugging Face Hub.

TL;DR: they are using ModelHubMixin together with snapshot_download, and they work around the unsupported input type by detaching the OmegaConf parameters from the model.

In hindsight, this defeats the purpose of using ModelHubMixin, but we are not here to enforce a specific architecture on them. This issue is an optional feature, but it would be nice to let developers know that their init parameters were not added to _hub_mixin_config because of their type.
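A rough sketch of the mechanism at play, as a hypothetical pure-Python stand-in (not the real ModelHubMixin code), assuming the mixin keeps only JSON-serializable init arguments: any value that cannot be serialized, such as an OmegaConf node, is silently dropped from the captured config, which is exactly why a warning would help.

```python
import inspect
import json


class ConfigCaptureMixin:
    """Hypothetical sketch: capture JSON-serializable __init__ arguments
    into a config dict so they can round-trip through save/load, silently
    dropping anything that cannot be serialized."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        original_init = cls.__init__
        sig = inspect.signature(original_init)

        def wrapped_init(self, *args, **kw):
            bound = sig.bind(self, *args, **kw)
            bound.apply_defaults()
            config = {}
            for name, value in bound.arguments.items():
                if name == "self":
                    continue
                try:
                    json.dumps(value)
                    config[name] = value
                except TypeError:
                    pass  # non-serializable values are silently dropped
            self._hub_mixin_config = config
            original_init(self, *args, **kw)

        cls.__init__ = wrapped_init


class Policy(ConfigCaptureMixin):
    def __init__(self, hidden_size=32, cfg=None):
        self.hidden_size = hidden_size
        self.cfg = cfg
```

Instantiating `Policy(hidden_size=64, cfg=some_omegaconf_node)` would capture only `hidden_size`; the `cfg` entry vanishes from `_hub_mixin_config` without any feedback to the developer.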

not-lain commented 1 month ago

I would prefer not to. For ModelHubMixin, we should differentiate between two types of users: the library maintainers who will integrate the mixin, and the end users who will not know what to do with the warning.

Reading this, I agree with you, so I'm closing this issue.