feifeibear/long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
Apache License 2.0
313 stars, 19 forks
add loss curve #58
Closed
feifeibear closed 3 months ago