bigcode-project / transformers

Apache License 2.0

Multi-query attention #4

Status: Closed (by jlamypoirier, 1 year ago)

bigximik commented 1 year ago

@jlamypoirier, I can merge this PR in the morning, along with anything you add today. I can also merge @lvwerra's branch from the Hugging Face repo and move everything from here to the gpt2mqa model. Would that be OK?

jlamypoirier commented 1 year ago

I think we should wait a bit; there still seem to be some bugs. [Edit: bugs are gone, ready to review/merge]

jlamypoirier commented 1 year ago

Merged to speed things up; it doesn't affect the rest of transformers anyway.