jzhang38 / EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
Apache License 2.0 · 529 stars · 33 forks
Modify interface #5
Closed · jzhang38 closed this 3 months ago
jzhang38 commented 3 months ago:
Modify the interface (see the Usage section in the README).
Fix a small bug in dist_flash_attn; dist_flash_attn and zigzag_ring_attn now produce the same loss.
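Checking that dist_flash_attn and zigzag_ring_attn produce the same loss works because blockwise (flash/ring-style) attention is mathematically equivalent to standard softmax attention. A minimal pure-Python sketch of that equivalence, using online-softmax accumulation over key/value blocks (function names here are illustrative, not EasyContext's API):

```python
import math

def naive_attention(q, ks, vs):
    # Standard softmax attention for a single query vector:
    # softmax(q . k_i) weighted sum of values.
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in ks]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    out = [0.0] * len(vs[0])
    for e, v in zip(exps, vs):
        for i, vi in enumerate(v):
            out[i] += (e / z) * vi
    return out

def blockwise_attention(q, ks, vs, block=2):
    # Online-softmax accumulation over key/value blocks, the core trick
    # behind flash/ring attention: maintain a running max (m), running
    # normalizer (z), and running weighted sum (acc), rescaling the
    # accumulators whenever a new block raises the max.
    m = float("-inf")
    z = 0.0
    acc = [0.0] * len(vs[0])
    for start in range(0, len(ks), block):
        for k, v in zip(ks[start:start + block], vs[start:start + block]):
            s = sum(qi * ki for qi, ki in zip(q, k))
            m_new = max(m, s)
            scale = math.exp(m - m_new)  # exp(-inf) == 0.0 on first step
            w = math.exp(s - m_new)
            z = z * scale + w
            acc = [a * scale + w * vi for a, vi in zip(acc, v)]
            m = m_new
    return [a / z for a in acc]

q = [0.1, 0.2]
ks = [[0.3, 0.1], [0.2, 0.4], [0.5, 0.0], [0.1, 0.1]]
vs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
ref = naive_attention(q, ks, vs)
blk = blockwise_attention(q, ks, vs)
assert all(abs(a - b) < 1e-9 for a, b in zip(ref, blk))
```

Because both formulations compute the identical softmax-weighted sum (up to floating-point rounding), two correct distributed implementations should yield matching losses; a divergence, as before this fix, signals a bug in one of them.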
jzhang38 commented 3 months ago:
Same loss.