This PR follows up on #332, adding support for bidirectional conversions of model configs between Curated Transformers and Hugging Face. This is facilitated by the following classes:
HFConfigKey - Descriptor for a HF model config key. Defines how it is mapped to the CT config.
HFConfigKeyDefault - Wrapper around a default value for a HF config key that allows for the generic handling of optional keys.
HFSpecificConfig - A set of hardcoded keys that are required to be part of every HF model config. Overridden on a per-model basis and merged into the final config dictionary.
CommonHFKeys and CommonCuratedToHFConverters - Shared config key descriptors and conversion functions.
The FromHFHub mixin has been expanded to provide methods for converting the configs.
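As a rough illustration of the descriptor idea described above, here is a minimal, self-contained sketch. It is not the code in this PR: `SketchConfigKey`, `SketchKeyDefault`, `hf_to_ct`, `ct_to_hf`, and the example key names are hypothetical stand-ins, and the configs are plain dictionaries rather than the real CT config classes.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class SketchKeyDefault:
    """Hypothetical wrapper: lets 'no default' be told apart from a default of None."""
    value: Any


@dataclass(frozen=True)
class SketchConfigKey:
    """Hypothetical descriptor tying one HF config key to one CT config key."""
    hf_name: str
    ct_name: str
    default: Optional[SketchKeyDefault] = None  # None means the key is required


def hf_to_ct(hf_config: Dict[str, Any], keys: List[SketchConfigKey]) -> Dict[str, Any]:
    """Build a CT-style dict from an HF-style dict, applying defaults for optional keys."""
    ct_config: Dict[str, Any] = {}
    for key in keys:
        if key.hf_name in hf_config:
            ct_config[key.ct_name] = hf_config[key.hf_name]
        elif key.default is not None:
            ct_config[key.ct_name] = key.default.value
        else:
            raise ValueError(f"Missing required HF config key: {key.hf_name}")
    return ct_config


def ct_to_hf(
    ct_config: Dict[str, Any],
    keys: List[SketchConfigKey],
    hf_specific: Dict[str, Any],
) -> Dict[str, Any]:
    """Build an HF-style dict from a CT-style dict, then merge in the hardcoded keys."""
    hf_config = {key.hf_name: ct_config[key.ct_name] for key in keys}
    hf_config.update(hf_specific)
    return hf_config


# Illustrative key names only; the real descriptors live in the PR.
KEYS = [
    SketchConfigKey("hidden_size", "hidden_width"),
    SketchConfigKey("num_hidden_layers", "n_hidden_layers"),
    SketchConfigKey("layer_norm_eps", "layer_norm_eps", SketchKeyDefault(1e-12)),
]
HF_SPECIFIC = {"model_type": "bert"}

ct = hf_to_ct({"hidden_size": 768, "num_hidden_layers": 12}, KEYS)
hf = ct_to_hf(ct, KEYS, HF_SPECIFIC)
```

Because both directions are driven by the same list of descriptors, adding a key in one place covers the HF-to-CT and CT-to-HF paths at once, which is the property the sketch is meant to show.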
All currently supported models except Falcon support bidirectional config conversion implicitly. Converting the Falcon model config is more complicated because we support loading from two different model implementations: RefinedWebModel and Falcon. That, together with the complicated new_decoder_architecture situation, makes it difficult to support the full range of conversions. Since the latter implementation is now in the mainline transformers branch, we'll do the following:
Conversion from both RWM and Falcon HF architectures is fully supported.
The CT Falcon config will only be converted to the mainline Falcon HF config/architecture.
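The asymmetry in the two points above can be sketched as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: the function names and the CT key names are hypothetical, the HF key names are meant only as examples of how the two architectures differ, and the new_decoder_architecture handling is deliberately left out.

```python
from typing import Any, Dict

# Hypothetical per-architecture mappings from HF key names to CT key names.
_HF_TO_CT_KEYS: Dict[str, Dict[str, str]] = {
    "RefinedWebModel": {
        "n_head": "n_attention_heads",
        "n_layer": "n_hidden_layers",
    },
    "falcon": {
        "num_attention_heads": "n_attention_heads",
        "num_hidden_layers": "n_hidden_layers",
    },
}


def falcon_hf_to_ct(hf_config: Dict[str, Any]) -> Dict[str, Any]:
    """Accept either HF architecture and produce one CT-style dict."""
    model_type = hf_config.get("model_type")
    if model_type not in _HF_TO_CT_KEYS:
        raise ValueError(f"Unsupported Falcon model type: {model_type!r}")
    mapping = _HF_TO_CT_KEYS[model_type]
    return {ct_name: hf_config[hf_name] for hf_name, ct_name in mapping.items()}


def falcon_ct_to_hf(ct_config: Dict[str, Any]) -> Dict[str, Any]:
    """Always emit the mainline Falcon layout, whatever the config was loaded from."""
    mapping = _HF_TO_CT_KEYS["falcon"]
    hf_config = {hf_name: ct_config[ct_name] for hf_name, ct_name in mapping.items()}
    hf_config["model_type"] = "falcon"
    return hf_config


rwm_config = {"model_type": "RefinedWebModel", "n_head": 71, "n_layer": 32}
ct_config = falcon_hf_to_ct(rwm_config)
mainline_config = falcon_ct_to_hf(ct_config)
```

In this sketch a config loaded from the RefinedWebModel layout does not round-trip to its original key names. It comes back in the mainline Falcon layout, which can itself be loaded again.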
Checklist
[x] I confirm that I have the right to submit this contribution under the project's MIT license.