This PR brings over the conversion and pushing scripts from bigcode/Megatron-LM (https://github.com/bigcode-project/Megatron-LM/tree/multi-query-attention/tools/hf_transformers), updated and adapted to the new codebase. Two output formats are supported: standalone (custom) models, which are packaged with their modeling code and usable with HF transformers (requires the upcoming version 4.27), and standard `GPTBigCodeLMHeadModel` models, which need bigcode transformers.
The scripts were moved here to simplify tracking and to keep them in sync with the model code.
Tested on the santacoder models; the converted models are pushed to https://huggingface.co/bigcode/santacoder-fast-inference
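
For reviewers who want to try the result, loading a converted model might look like the sketch below. It is not taken from the scripts themselves. It assumes the standalone variant ships its modeling code in the Hub repository (hence `trust_remote_code=True`) and that a tokenizer is stored in the same repository; neither is confirmed in this description. The helper names `from_pretrained_kwargs` and `load` are hypothetical.

```python
# Sketch only: repository contents and loading path are assumptions, not verified.
REPO_ID = "bigcode/santacoder-fast-inference"  # repository named in this PR


def from_pretrained_kwargs(standalone: bool) -> dict:
    """Extra keyword arguments for `from_pretrained` (hypothetical helper).

    Standalone (custom) models carry their own modeling code, so
    transformers must be explicitly allowed to execute it.
    """
    return {"trust_remote_code": True} if standalone else {}


def load(repo_id: str = REPO_ID, standalone: bool = True):
    """Load tokenizer and model; assumes both live in `repo_id`."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, **from_pretrained_kwargs(standalone)
    )
    return tokenizer, model
```

With `standalone=False`, the same call relies on the installed transformers build providing `GPTBigCodeLMHeadModel`, which per the description above means bigcode transformers.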