AI21Labs / Parallel-Context-Windows

Apache License 2.0
101 stars, 12 forks

GPT-J? #1

Open okhat opened 1 year ago

okhat commented 1 year ago

Thank you for the great work and release! Will this work with GPT-J (modulo minor edits to the code)?

inbalmai21 commented 1 year ago

Hi Omar, thanks for your interest!

GPT-J models use Rotary Position Embedding (RoPE), and from what I've seen, the code looks quite different and would require a somewhat different implementation. It would be easier to adapt the code for OPT models.
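As a rough illustration of why RoPE changes the picture, here is a minimal NumPy sketch, not code from this repo. It assumes the method works by controlling position ids per context window (the `window_ids` layout below is hypothetical). With learned absolute embeddings, position ids feed one table lookup at the input; with RoPE, the same ids must reach the rotation applied to queries and keys inside every attention layer. The helper names are made up for this example, and the half-split pairing of dimensions is a simplification that may not match GPT-J's actual layout.

```python
import numpy as np


def learned_position_lookup(position_ids, table):
    """Learned absolute positions: a single table lookup added at the input."""
    return table[position_ids]


def rotary_embed(x, position_ids, base=10000.0):
    """Rotate feature pairs of x (seq, dim) by angles that depend on position.

    Simplified half-split pairing: feature i is paired with feature i + dim/2.
    """
    seq, dim = x.shape
    half = dim // 2
    # One rotation frequency per feature pair.
    inv_freq = base ** (-np.arange(half) / half)
    angles = position_ids[:, None] * inv_freq[None, :]  # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)


# Hypothetical layout: two context windows of length 3 that reuse ids 0..2.
window_ids = np.tile(np.arange(3), 2)  # [0, 1, 2, 0, 1, 2]

rng = np.random.default_rng(0)
table = rng.normal(size=(16, 8))   # stand-in for a learned embedding table
queries = rng.normal(size=(6, 8))  # stand-in for per-token query vectors

# Absolute embeddings: ids are consumed once, before the first layer.
abs_pos = learned_position_lookup(window_ids, table)

# RoPE: ids are consumed inside attention, on the queries and keys.
rot_q = rotary_embed(queries, window_ids)
```

In the absolute case, changing `window_ids` only affects what is added at the input, while in the rotary case the ids have to be threaded through to wherever queries and keys are rotated, which is likely where the two implementations diverge.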

Main changes needed (for OPT):

If you would like to implement these changes, we would be happy to review your PR, or alternatively, you may wait until we prioritize this request.

Hope this helps :smiley:

amurtadha commented 1 year ago

Hi @inbalmai21, I have already written the code for OPT and also fixed a minor issue with GPT2 where B > 5. I would like to contribute to this repo.

Thank you