berlino / gated_linear_attention

Is it possible to extend the code to accept padding masks? #3

KatarinaYuan closed this issue 8 months ago

KatarinaYuan commented 10 months ago

Hi, thank you for this interesting work. I was wondering whether it's possible to extend the currently public code to accept padding masks as arguments. Thank you for the help!

sustcsonglin commented 10 months ago

Yep! I can implement this within a few days.

KatarinaYuan commented 10 months ago

Thank you so much for helping! Looking forward to that!

sustcsonglin commented 10 months ago

Hello, I wonder what kind of padding mask you need? The model only supports causal linear attention.
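
To illustrate why the kind of padding matters here, below is a minimal NumPy sketch. It is not this repository's code: `causal_linear_attention` and its `mask` argument are hypothetical names, and the sketch assumes plain unnormalized causal linear attention with no gating. Under those assumptions, right padding needs no mask at all, because a real token never attends to later positions, while left padding can be handled by zeroing the padded keys and values.

```python
import numpy as np

def causal_linear_attention(q, k, v, mask=None):
    """Unnormalized causal linear attention (hypothetical helper, no gating).

    Computes o_t = q_t @ sum_{s<=t} outer(k_s, v_s).
    q, k: (T, d_k); v: (T, d_v); mask: (T,) with 1 for real tokens, 0 for padding.
    """
    if mask is not None:
        # Zero padded keys/values so they never enter the running state.
        k = k * mask[:, None]
        v = v * mask[:, None]
    state = np.zeros((k.shape[1], v.shape[1]))
    out = np.empty((q.shape[0], v.shape[1]))
    for t in range(q.shape[0]):
        state += np.outer(k[t], v[t])  # accumulate only past and current tokens
        out[t] = q[t] @ state
    return out

rng = np.random.default_rng(0)
T, pad, d = 5, 3, 4
q, k, v = (rng.standard_normal((T, d)) for _ in range(3))
ref = causal_linear_attention(q, k, v)  # output for the unpadded sequence

def pad_right(x):
    # Append `pad` rows of arbitrary values after the real tokens.
    return np.concatenate([x, rng.standard_normal((pad, x.shape[1]))])

def pad_left(x):
    # Prepend `pad` rows of arbitrary values before the real tokens.
    return np.concatenate([rng.standard_normal((pad, x.shape[1])), x])

# Right padding: outputs at the real positions are unchanged without any mask.
out_right = causal_linear_attention(pad_right(q), pad_right(k), pad_right(v))[:T]

# Left padding: padded tokens leak into the state unless they are masked out.
ql, kl, vl = pad_left(q), pad_left(k), pad_left(v)
left_mask = np.concatenate([np.zeros(pad), np.ones(T)])
out_left_unmasked = causal_linear_attention(ql, kl, vl)[pad:]
out_left_masked = causal_linear_attention(ql, kl, vl, mask=left_mask)[pad:]
```

In this sketch, `out_right` and `out_left_masked` match `ref`, while `out_left_unmasked` does not. Outputs at the padded positions themselves are meaningless and should be ignored, for example by masking them out of the loss.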