Closed FFFiend closed 1 year ago
In principle you should be able to apply gist masking to basically any decoder or encoder-decoder Transformer language model (see the paper for details), but you're right that this will require training your own model and, in most cases, modifying the implementation of the LM to support custom attention masks. You can take a look at the implementations of gist_llama and gist_t5 to see how such masking is implemented, and diff my implementation with the reference Transformers implementation, but right now I unfortunately don't have plans to support other LMs. Sorry!
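To illustrate the kind of custom attention mask involved, here is a minimal sketch of gist masking for a decoder-only LM, following the paper's description: queries after the gist span may not attend to pre-gist keys, so the gist tokens must carry all prompt information. The function name and exact index conventions are my own for illustration; the actual gist_llama and gist_t5 implementations differ in detail.

```python
import numpy as np

def make_gist_mask(seq_len: int, gist_start: int, gist_end: int) -> np.ndarray:
    """Boolean (query, key) mask; True means attention is allowed.

    Illustrative sketch of gist masking: start from a standard causal
    mask, then block queries after the gist span [gist_start, gist_end)
    from attending to any key before the gist span.
    """
    # Standard causal mask: query i may attend to keys j <= i.
    mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))
    # Post-gist queries cannot see pre-gist keys.
    mask[gist_end:, :gist_start] = False
    return mask

# Example: 6 tokens, gist tokens at positions 2 and 3.
m = make_gist_mask(6, gist_start=2, gist_end=4)
```

With this mask, position 5 attends only to positions 2–5, so at inference time the pre-gist prompt can be discarded and replaced by the (cached) gist token activations.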
Thanks Jesse 😄
Basically title, or excuse my ignorance if this is not possible. Right now it seems like only FLAN-T5 and LLaMA-7B are supported.