huggingface / huggingface_hub

The official Python client for the Huggingface Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

push_to_hub_keras should support multiple models #533

Open · merveenoyan opened this issue 2 years ago

merveenoyan commented 2 years ago

It's a minor UX improvement for push_to_hub_keras.

Some applications, such as RL or non-transformers encoder-decoder networks (e.g. a CNN encoder with an RNN decoder), use multiple models instead of a single one.

We could pass a list to model, get the model names, save each model under its own folder, and push. (We would first check with isinstance whether we received a list or a single model.)

An alternative is to git pull every time we push with push_to_hub_keras (which is cumbersome in a notebook) and to call push_to_hub_keras repeatedly, once per model.
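To make the first option concrete, here is a minimal, hypothetical sketch of the dispatch (not the actual implementation); it reuses the existing save_pretrained_keras helper, and the helper name and folder-naming scheme are assumptions:

import os
from huggingface_hub import save_pretrained_keras  # existing single-model helper

def save_one_or_many(model, repo_dir):
    # Hypothetical helper: accept either a single Keras model or a list of them.
    if isinstance(model, (list, tuple)):
        # Multiple models: save each one in its own subfolder, named by list order.
        for i, m in enumerate(model):
            save_pretrained_keras(m, os.path.join(repo_dir, f"model_{i}"))
    else:
        # Single model: keep the current behaviour.
        save_pretrained_keras(model, repo_dir)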

cc: @osanseviero

merveenoyan commented 2 years ago

I'm currently working on this for the GAN sprint, where you have to push two models to one repository. (Leaving notes to myself.) The first argument of push_to_hub_keras() should be a model or a list of model objects; if it's a list, we check that all the models are built, and we also create a custom model card covering both models (maybe by appending cards, which is not that aesthetic). There would be two folders with separate models (which could be named by list order, e.g. model_1, which I find sloppy), and from_pretrained_keras() should support loading them separately (or something like model_1, model_2 = from_pretrained_keras(...)). The repo would look like:

model_1/
    model_1 files (including plot)
model_2/
    model_2 files (including plot)
README.md

The widget will never work anyway if the pipeline is not overridden, so it's fine imo.

merveenoyan commented 2 years ago

I think the following is better so people can name the folders (with a list it would be more like model_0, model_1, etc.):

push_to_hub_keras(
    model={"generator": generator, "discriminator": discriminator},
    repo_path_or_name="merve/multiple-models-...",
...)

Which looks like this:

(Screenshot of the resulting repository: "Ekran Resmi 2022-03-03 17 23 00")
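On the loading side, a purely hypothetical counterpart mirroring the dict passed at push time (from_pretrained_keras does not support this today; the dict-shaped return value is only a design sketch along the lines discussed earlier in this thread):

from huggingface_hub import from_pretrained_keras

# Hypothetical: return the sub-models keyed by the folder names used at push time.
models = from_pretrained_keras("merve/multiple-models-...")
generator = models["generator"]
discriminator = models["discriminator"]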

However, the README looks ugly because I just append the individual cards. You can see, for example, the README of the repo where I pushed multiple models.

If no one has strong design opinions (I'd rather you had them 😅), I'll fix the plot path in the model card and write tests. cc: @nateraw

osanseviero commented 2 years ago

I feel only the following sections are needed:

  1. Model description (for the whole thing)
  2. Intended uses and limitations
  3. Training and eval data
  4. Training procedure for model A
  5. Model plot for model A
  6. Training procedure for model B
  7. Model plot for model B

Maybe before 4 we can have a section saying "this repository consists of X models". Then before 4 and before 6, have a heading like "Further information for model A|B". WDYT? We could do this by breaking up the README-generation function a bit, I feel.
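A rough, hypothetical sketch of how the README generation could assemble these sections (the function name and section layout are made up; the real card template in huggingface_hub differs):

def build_multi_model_card(model_names):
    # Hypothetical: shared sections first, then one block per model.
    lines = [
        "# Model description",
        "...",
        "## Intended uses and limitations",
        "...",
        "## Training and evaluation data",
        "...",
        f"This repository consists of {len(model_names)} models.",
    ]
    for name in model_names:
        lines += [
            f"## Further information for {name}",
            f"### Training procedure for {name}",
            "...",
            f"### Model plot for {name}",
            "...",
        ]
    return "\n\n".join(lines)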

merveenoyan commented 2 years ago

Makes sense, I will change the card generation procedure. Thanks a lot for the input.

nateraw commented 2 years ago

Sure, I like this idea 😄. It would be very helpful, especially in the case of GANs. I'd really be curious to know whether other folks want this feature.

As for the model card, I think we can either do what @osanseviero is describing or just leave it as-is. I'm guessing this would only be used when models are trained together; people probably shouldn't be dumping unrelated models in the same repo, and we don't want to encourage that.

osanseviero commented 2 years ago

Yes, good point @nateraw. TBH, since this is not a feature for which I expect significantly high usage, I think we can go as-is and iterate if we see significant demand.

Wauplin commented 2 years ago

@merveenoyan (cc @osanseviero @nateraw) I'm jumping into the discussion, hope it's not outdated 😬

Since push_to_hub_keras is now HTTP-based (https://github.com/huggingface/huggingface_hub/pull/847 is merged) and no longer requires downloading the full repository, calling push_to_hub_keras multiple times (once per model) is not so cumbersome anymore, right? Or would it still be a nice feature to have? I saw that https://github.com/huggingface/huggingface_hub/pull/809 has been abandoned. Should I just close this issue?

merveenoyan commented 1 year ago

@Wauplin we decided not to go with it; it wasn't that cumbersome anyway, it was just a matter of pairing two models that were trained together. Assume you run experiments with a pipeline that has a CNN model and an RNN model. It's a little awkward if someone serializes these models into two different repositories: you run different experiments, so your Monday CNN should go with your Monday RNN, and the repositories should make that pairing clear. With separate repositories you end up with multiple repos even though the two models belong together.

So when users load these models, they have to load them separately. My solution was to have one repository that holds both models.

That way, when someone loads it, they can load everything with one line of code. I think it would also be fine if people versioned using revisions, e.g. having a monday revision for both model repositories and a tuesday revision in parallel. (I don't know if I'm being clear, you can stop me if I'm not 😅) So I don't know if we should close this discussion or not.
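For reference, a small illustration of the revision-based alternative described above (the repo names are made up, and this assumes from_pretrained_keras forwards the usual revision argument):

from huggingface_hub import from_pretrained_keras

# Hypothetical repos: the CNN and RNN trained together in the "monday" experiment.
cnn = from_pretrained_keras("merve/experiment-cnn", revision="monday")
rnn = from_pretrained_keras("merve/experiment-rnn", revision="monday")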

Also pinging @sayakpaul since we discussed this for the dreambooth event (a case where two models belong together).

Wauplin commented 1 year ago

Hi @merveenoyan, thanks for reopening the topic! The explanation is clear to me :smile: It reminds me of this discussion initiated in diffusers a while ago. Their problem is a bit different but still similar: a diffusers pipeline can be made of different components (scheduler, denoiser, decoder) that could in theory be taken from different repos/versions. So they also need the possibility to load/push only components instead of the entire pipeline.

That said, I'm not sure what should live in huggingface_hub and what should be specific to integrations. In the push_to_hub_keras situation, I definitely agree that a single repo like merve/CNN_RNN_I_trained makes sense, using git tags (create_tag) to pin revisions monday, tuesday, ... (or more likely experiment_1, experiment_2, ...).
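For illustration, tagging a pushed experiment could look roughly like this (create_tag is an existing huggingface_hub helper; the repo name and tag are made up):

from huggingface_hub import create_tag

# Pin the current state of the repo under an experiment tag,
# so the paired models can later be loaded at the same revision.
create_tag("merve/CNN_RNN_I_trained", tag="experiment_1")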