explosion / curated-transformers

🤖 A PyTorch library of curated Transformer models and their composable components
MIT License

Add support for converting Curated Transformer configs to Hugging Face compatible configs #333

Closed shadeMe closed 1 year ago

shadeMe commented 1 year ago

Description

This PR follows up on #332, adding support for bidirectional conversions of model configs between Curated Transformers and Hugging Face. This is facilitated by the following classes:

The FromHFHub mixin has been expanded with methods for converting the configs.
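A minimal sketch of what such a bidirectional config conversion looks like. Note that the class, field, and function names below are illustrative assumptions, not the actual curated-transformers API; the real implementation lives in the mixin methods described above.

```python
# Hypothetical sketch of bidirectional config conversion between a
# Curated Transformers-style config and a Hugging Face-style dict.
# All names here are illustrative, not the library's actual API.
from dataclasses import dataclass


@dataclass
class CuratedConfig:
    # Example fields; real configs carry many more hyperparameters.
    hidden_width: int
    n_attention_heads: int


# Mapping from (assumed) Curated Transformers keys to Hugging Face keys.
HF_KEY_MAP = {
    "hidden_width": "hidden_size",
    "n_attention_heads": "num_attention_heads",
}


def config_to_hf(config: CuratedConfig) -> dict:
    """Convert a Curated-style config to a Hugging Face-style dict."""
    return {hf: getattr(config, ours) for ours, hf in HF_KEY_MAP.items()}


def config_from_hf(hf_config: dict) -> CuratedConfig:
    """Convert a Hugging Face-style dict back to a Curated-style config."""
    return CuratedConfig(**{ours: hf_config[hf] for ours, hf in HF_KEY_MAP.items()})


config = CuratedConfig(hidden_width=768, n_attention_heads=12)
hf = config_to_hf(config)
assert hf == {"hidden_size": 768, "num_attention_heads": 12}
assert config_from_hf(hf) == config  # round trip is lossless
```

The key property is that the round trip is lossless for any model whose config keys have a one-to-one mapping, which is why most models get this conversion "for free" while Falcon, discussed below, does not.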

All currently supported models except Falcon support bidirectional config conversions implicitly. Converting the Falcon model config is more complicated because we support loading from two different model implementations, RefinedWebModel and Falcon. That, together with the complicated new_decoder_architecture situation, makes it difficult to support the full range of conversions. Since the latter implementation is now in the mainline transformers branch, we'll do the following:

Checklist