zhiyuanhubj / LongRecipe

LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
https://arxiv.org/abs/2409.00509

Upgrade yunchang (feifeibear/long-context-attention) to version 0.3.0 #2

Open feifeibear opened 1 week ago

feifeibear commented 1 week ago

Thank you for contributing LongRecipe. I noticed the recent release of the technical report. We have been following the EasyContext project for a long time, and both LongRecipe and EasyContext use yunchang to implement Ulysses sequence parallelism.

yunchang was recently upgraded to version 0.3.0, and I noticed you are still using version 0.1. The new version supports flash_attn >= 2.6.0 and works on NVIDIA Tesla and Volta GPUs.

I can help you upgrade the yunchang version.
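
The version requirements above can be checked locally before upgrading. The following is only a minimal sketch: the helper names (`parse_version`, `meets_minimum`, `check_installed`) are hypothetical, and the distribution names `yunchang` and `flash-attn` are assumed to match what pip installed in your environment.

```python
import re
from importlib import metadata


def parse_version(text):
    """Return the leading numeric components, e.g. '2.6.0.post1' -> (2, 6, 0)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component such as 'post1'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings on their numeric components only."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.3' and '0.3.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_installed(dist_name, minimum):
    """Return True/False for an installed distribution, or None if it is missing."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return meets_minimum(installed, minimum)


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to your environment.
    for name, minimum in [("yunchang", "0.3.0"), ("flash-attn", "2.6.0")]:
        print(name, ">=", minimum, "->", check_installed(name, minimum))
```

A result of `None` means the package was not found under that distribution name, which is worth ruling out before concluding that the version is too old.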

feifeibear commented 1 week ago

Additionally, I strongly recommend using yunchang's USP, a hybrid parallelism approach that combines ring and Ulysses attention. It not only achieves higher training TFLOPS but also simplifies your code: a single interface, LongContextAttention, applies Ulysses, ring (we also use zilin's implementation), or a hybrid of both.

USP: A Unified Sequence Parallelism Approach for Long Context Generative AI https://arxiv.org/abs/2405.07719
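
To make the hybrid idea concrete, here is a rough sketch of how a sequence-parallel world can be factored into a Ulysses degree and a ring degree. It is not yunchang's code: `build_usp_groups` is a hypothetical helper, and the layout (contiguous ranks for Ulysses groups, strided ranks for ring groups) is one possible arrangement assumed for illustration.

```python
def build_usp_groups(world_size, ulysses_degree, ring_degree):
    """Hypothetical helper: split ranks into Ulysses groups and ring groups.

    Assumes the sequence-parallel size factors as
    ulysses_degree * ring_degree == world_size.
    """
    if ulysses_degree < 1 or ring_degree < 1:
        raise ValueError("degrees must be positive")
    if ulysses_degree * ring_degree != world_size:
        raise ValueError("ulysses_degree * ring_degree must equal world_size")

    # Ulysses groups: contiguous ranks (assumed layout for this sketch).
    ulysses_groups = [
        list(range(i * ulysses_degree, (i + 1) * ulysses_degree))
        for i in range(ring_degree)
    ]
    # Ring groups: ranks strided by ulysses_degree, one per Ulysses position.
    ring_groups = [
        list(range(j, world_size, ulysses_degree))
        for j in range(ulysses_degree)
    ]
    return ulysses_groups, ring_groups


if __name__ == "__main__":
    ulysses, ring = build_usp_groups(world_size=8, ulysses_degree=4, ring_degree=2)
    print("Ulysses groups:", ulysses)
    print("Ring groups:   ", ring)
```

Setting `ring_degree=1` reduces this to pure Ulysses and `ulysses_degree=1` to pure ring, which is why one interface can cover all three modes.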

zhiyuanhubj commented 1 week ago

Hello, thank you for your appreciation of our work. We also noticed USP and tried to use it in our experiments. However, we consistently ran into library compatibility issues, so we switched to Ulysses or Ring. We would be happy to upgrade the yunchang version and use USP for parallelism. If possible, would you mind submitting a Pull Request? We will validate it and merge it into our current repo.

feifeibear commented 1 week ago

I have opened a PR against EasyContext: https://github.com/jzhang38/EasyContext/pull/50