bigcode-project / transformers

Apache License 2.0
Fork the model into GPTBigCode #6

Closed jlamypoirier closed 1 year ago

jlamypoirier commented 1 year ago

Create a new model identical to GPT2. The actual changes will be added in later PRs so they are easier to track. Based on https://github.com/huggingface/transformers/pull/21253, without the doc and model changes. Also using the name GPTBigCode rather than an MQA-specific one, because the model does a lot more than MQA (multi-query attention), but the name can be changed later.
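As a rough illustration of what "a new model identical to GPT2" means in practice, a fork like this mostly amounts to copying the GPT2 source files and renaming identifiers. The sketch below is not the script used for this PR. The rename table and the sample source string are assumptions made up for the example.

```python
# Hypothetical rename table for forking GPT2 into a new model name.
# These pairs are assumptions for illustration, not taken from the PR.
RENAMES = [
    ("GPT2", "GPTBigCode"),   # class names, e.g. GPT2Model
    ("gpt2", "gpt_bigcode"),  # module and model_type strings
]


def fork_source(source: str) -> str:
    """Return a copy of `source` with every GPT2 identifier renamed."""
    for old, new in RENAMES:
        source = source.replace(old, new)
    return source


# Made-up stand-in for a snippet of a copied modeling file.
original = '''class GPT2Model(GPT2PreTrainedModel):
    model_type = "gpt2"
'''

forked = fork_source(original)
print(forked)
```

Because only names change, the forked code behaves exactly like the original, and any later diff against it shows only the real model changes.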

Tested with https://github.com/bigcode-project/bigcode-inference-benchmark/pull/14. Many things probably won't work until they are updated (tests, ONNX, pretrained models, etc.).

jlamypoirier commented 1 year ago

Merged to speed things up; it doesn't affect the rest of transformers anyway.