Separation of model config and weights

kklemon commented 1 year ago

I am writing a high-level library for protein embedding extraction that supports various models, including ESM. For testing purposes, I would like to avoid downloading the full model weights as part of my CI pipeline and instead use uninitialized ones derived from a specific model configuration.

For Hugging Face models, this is easy to implement since the model configuration and weights can be downloaded separately. However, the ESM models are represented by a single .pt file that contains both the model configuration and the weights. This makes it impossible to instantiate a particular model without downloading the weights, which of course can cause dozens of gigabytes of traffic.

Would it therefore be possible to either separate the model configurations and the weights, i.e. provide two separate files, or alternatively, commit the model configs to git?

tomsercu commented 1 year ago

Hi @kklemon, we don't really plan to change the way our models and configs are packaged up on our side. It would be easy to write a script that loads the XYZ.pt file, pulls out the config, and saves it as a XYZ_config.pt file. That config.pt file could probably be hosted in your own GitHub repo? Let me know if that solution would work.
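A minimal sketch of such a script, under a few assumptions: the checkpoint is taken to be a plain dict whose weights sit under a "model" key, with everything else being configuration. That key name should be checked against the actual file before relying on it. The helper names `strip_weights` and `extract_config` are hypothetical and not part of the esm package. The full .pt file still has to be downloaded once to produce the small config file.

```python
from typing import Any, Dict, Iterable


def strip_weights(
    checkpoint: Dict[str, Any],
    weight_keys: Iterable[str] = ("model",),  # assumed key name for the weights
) -> Dict[str, Any]:
    """Return a shallow copy of the checkpoint dict without the weight entries."""
    drop = set(weight_keys)
    return {key: value for key, value in checkpoint.items() if key not in drop}


def extract_config(src: str, dst: str) -> None:
    """Load the checkpoint at `src` and save everything except the weights to `dst`."""
    # Imported here so strip_weights stays usable without PyTorch installed.
    import torch

    # weights_only=False allows non-tensor objects (e.g. a pickled config) to be
    # loaded; it unpickles arbitrary objects, so only use it on trusted files.
    checkpoint = torch.load(src, map_location="cpu", weights_only=False)
    torch.save(strip_weights(checkpoint), dst)
```

Calling `extract_config("XYZ.pt", "XYZ_config.pt")` would then write the config-only file, which should be small enough to commit to a repository and load in CI.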

kklemon commented 1 year ago

I think that would be the best solution at the moment, even if it might require regular retrieval of models and configurations. But I can well understand that you can't introduce such a big change for a special use case like mine.

facebookresearch / esm

Separation of model config and weights #395