pytorch-labs / attention-gym

Helpful tools and examples for working with flex-attention
BSD 3-Clause "New" or "Revised" License

Add Global + Sliding Window #3

Open drisspg opened 3 months ago

drisspg commented 3 months ago

Summary

Paper: Longformer (arxiv.org/pdf/2004.05150)
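
For reference, a minimal sketch of what a global + sliding window mask could look like. It is written as a predicate with the `(b, h, q_idx, kv_idx)` signature that flex-attention uses for a `mask_mod`, but it is checked here only with plain Python integers. The factory name `make_global_sliding_window_mask`, the helper `dense_mask`, and the parameters `window` and `num_global` are hypothetical and not part of the repo. Two assumptions: the global tokens are taken to be the first `num_global` positions (the paper chooses global positions per task), and `window` is a one-sided radius rather than the paper's total window size.

```python
def make_global_sliding_window_mask(window: int, num_global: int):
    """Build a mask_mod-style predicate: True where a query may attend to a key.

    Assumptions (illustrative only): `window` is the one-sided radius of the
    local band, and the first `num_global` positions are the global tokens.
    """

    def mask_mod(b, h, q_idx, kv_idx):
        # Local band: keys at most `window` positions away from the query.
        in_window = abs(q_idx - kv_idx) <= window
        # Global attention is symmetric: global tokens attend to every
        # position, and every position attends to the global tokens.
        is_global = (q_idx < num_global) | (kv_idx < num_global)
        return in_window | is_global

    return mask_mod


def dense_mask(mask_mod, seq_len: int):
    """Materialize the predicate as a seq_len x seq_len list of bools."""
    return [
        [bool(mask_mod(0, 0, q, k)) for k in range(seq_len)]
        for q in range(seq_len)
    ]


mask = dense_mask(make_global_sliding_window_mask(window=1, num_global=1), seq_len=6)
for row in mask:
    # "x" marks an allowed (query, key) pair, "." a masked one.
    print("".join("x" if allowed else "." for allowed in row))
```

The predicate uses only comparisons, `abs`, and `|`, so it should in principle also accept index tensors and could be passed to `create_block_mask`, though that path is not exercised in this sketch.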