ThyrixYang closed 1 year ago
@ThyrixYang this one is a chunked version with lookback, without the need for CUDA
Longformer is from way back, and I believe it is local attention mixed with dedicated global attention lanes
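To make "chunked with lookback" concrete, here is a minimal NumPy sketch of the general idea. It is not this library's code: the function name `chunked_local_attention` and its parameters are hypothetical, it handles a single head with no batch dimension, and it assumes the sequence length is a multiple of the window size. Each chunk of queries attends to the keys in its own chunk plus the previous chunk, using only plain array operations, so no custom CUDA kernel is involved.

```python
import numpy as np


def chunked_local_attention(q, k, v, window, causal=False):
    """Illustrative sketch only (hypothetical helper, not the library's API).

    q, k, v: arrays of shape (n, d); n must be a multiple of `window`.
    Each query chunk attends to its own chunk and the one before it.
    """
    n, d = q.shape
    assert n % window == 0, "sequence length must be divisible by window"
    c = n // window  # number of chunks

    # Split the sequence into chunks: (c, window, d)
    qc = q.reshape(c, window, d)
    kc = k.reshape(c, window, d)
    vc = v.reshape(c, window, d)

    # Look back: prepend the previous chunk's keys/values to each chunk.
    # Chunk 0 has no predecessor, so it gets zero padding (masked out below).
    pad_k = np.zeros((1, window, d), dtype=k.dtype)
    pad_v = np.zeros((1, window, d), dtype=v.dtype)
    k_ctx = np.concatenate([np.concatenate([pad_k, kc[:-1]]), kc], axis=1)  # (c, 2w, d)
    v_ctx = np.concatenate([np.concatenate([pad_v, vc[:-1]]), vc], axis=1)  # (c, 2w, d)

    # Scaled dot-product scores per chunk: (c, w, 2w)
    scores = qc @ k_ctx.transpose(0, 2, 1) / np.sqrt(d)

    # Absolute positions of queries and of the keys each chunk can see.
    q_pos = np.arange(n).reshape(c, window, 1)
    k_pos = (np.arange(c)[:, None] - 1) * window + np.arange(2 * window)[None, :]
    k_pos = k_pos[:, None, :]  # (c, 1, 2w)

    mask = k_pos < 0  # padding in front of the first chunk
    if causal:
        mask = mask | (k_pos > q_pos)  # no attending to future positions
    scores = np.where(mask, -np.inf, scores)

    # Numerically stable softmax over the 2w visible keys.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    return (weights @ v_ctx).reshape(n, d)
```

The score tensor here has shape `(n / window, window, 2 * window)`, so memory grows linearly with sequence length for a fixed window, instead of quadratically as with full attention. Longformer-style global attention lanes are not part of this sketch.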
@lucidrains Thanks for your explanation.
Are there any benchmarks for this library's performance (memory, speed, accuracy)? Is this library a direct implementation of the Longformer method? Longformer has a chunked version and a CUDA version; which one does this library implement?
Would you please provide more details about the existing usage of this library?