bigcode-project / transformers

Apache License 2.0

Megatron conversion script #8

Closed jlamypoirier closed 1 year ago

jlamypoirier commented 1 year ago

This PR adds the conversion and pushing scripts from bigcode/Megatron-LM (https://github.com/bigcode-project/Megatron-LM/tree/multi-query-attention/tools/hf_transformers), updated and adapted to the new codebase. The scripts support two output formats: a standalone (custom) model, packaged with its modeling code and usable with stock HF transformers (this requires the upcoming version 4.27), or a standard GPTBigCodeLMHeadModel, which requires the bigcode fork of transformers.
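
As a rough illustration of how the two formats differ on the loading side, here is a minimal sketch. It is not part of the scripts in this PR. The helper names `loading_plan` and `load_converted` are hypothetical. It also assumes that the standalone format is loaded through `AutoModelForCausalLM` with `trust_remote_code=True`, and that the bigcode fork exports `GPTBigCodeLMHeadModel` at the top level of the `transformers` package.

```python
def loading_plan(standalone: bool) -> dict:
    """Describe how a converted checkpoint would be loaded (illustrative only)."""
    if standalone:
        # Standalone (custom) model: the modeling code ships with the checkpoint,
        # so stock transformers has to be allowed to execute it.
        return {"loader": "AutoModelForCausalLM", "kwargs": {"trust_remote_code": True}}
    # Standard model: the class comes from the library itself.
    # Assumption: the bigcode fork exposes it as transformers.GPTBigCodeLMHeadModel.
    return {"loader": "GPTBigCodeLMHeadModel", "kwargs": {}}


def load_converted(repo_id: str, standalone: bool = True):
    """Load a converted checkpoint from a Hub repo id or a local directory."""
    # Imported lazily so the plan above can be inspected without transformers installed.
    import transformers

    plan = loading_plan(standalone)
    loader = getattr(transformers, plan["loader"])
    return loader.from_pretrained(repo_id, **plan["kwargs"])
```

For example, `load_converted("path/or/repo-id", standalone=True)` would go through the custom-code path, while `standalone=False` would only work in an environment where the bigcode fork is installed.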

Moved the scripts here to simplify tracking and keep them in sync with the model code.

Tested on the santacoder models; the converted checkpoints were pushed to https://huggingface.co/bigcode/santacoder-fast-inference