FlagOpen / FlagScale

FlagScale is a large model toolkit based on open-sourced projects.

[New Feature] Support Ulysses sequence parallelism #187

Closed heavyrain-lzy closed 1 month ago

heavyrain-lzy commented 1 month ago

Add DeepSpeed-Ulysses sequence parallelism. Context parallelism and Ulysses parallelism can be enabled together or individually to train long-sequence models.
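For readers new to Ulysses, here is a rough sketch of the all-to-all exchange at its core. Before attention, each rank holds a slice of the sequence with all heads; after the exchange, each rank holds the full sequence for a slice of the heads. The sketch simulates the ranks with plain Python lists and does not use torch.distributed or any FlagScale code. The function name ulysses_all_to_all and the toy sizes are made up for illustration.

```python
# Simulated Ulysses all-to-all on plain Python lists (no GPUs, no real communication).
# This is an illustration of the data movement only, not FlagScale's implementation.

def ulysses_all_to_all(shards, sp_size):
    """shards[r][t][h] is the value held by rank r for local token t and head h.

    Returns out[r][t][h_local]: rank r now holds every token of the sequence,
    but only the heads r * heads_per_rank .. (r + 1) * heads_per_rank - 1.
    """
    assert len(shards) == sp_size
    num_heads = len(shards[0][0])
    assert num_heads % sp_size == 0, "head count must be divisible by sp_size"
    heads_per_rank = num_heads // sp_size

    out = []
    for dst in range(sp_size):
        h0 = dst * heads_per_rank
        full_seq = []
        # Concatenate the chunks in rank order so the original token order is restored.
        for src in range(sp_size):
            for token in shards[src]:
                full_seq.append(token[h0:h0 + heads_per_rank])
        out.append(full_seq)
    return out


# Toy input: 4 tokens, 4 heads, 2 ranks. Each entry is (token_index, head_index).
sp_size, seq_len, num_heads = 2, 4, 4
local_len = seq_len // sp_size
shards = [
    [[(r * local_len + t, h) for h in range(num_heads)] for t in range(local_len)]
    for r in range(sp_size)
]
result = ulysses_all_to_all(shards, sp_size)
```

In this toy run, rank 0 ends up with all four tokens for heads 0 and 1, and rank 1 with all four tokens for heads 2 and 3. Attention can then run locally per head, and a second all-to-all in the opposite direction restores the sequence-sharded layout.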

You can set ulysses_sp_parallel_size in the config file:

system:
  tensor_model_parallel_size: 1
  pipeline_model_parallel_size: 1
  ulysses_sp_parallel_size: 2
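When choosing a value for ulysses_sp_parallel_size, a quick sanity check along the following lines may help. It encodes two constraints that follow from how Ulysses works: the attention heads are split across the Ulysses group, and the sequence is split across all sequence-parallel ranks. The helper check_ulysses_config and its parameters are hypothetical and not part of FlagScale. The sketch assumes tensor_model_parallel_size is 1, as in the config above, and the checks FlagScale actually performs may be stricter.

```python
def check_ulysses_config(seq_len, num_heads, ulysses_size, context_size=1):
    """Return a list of problems; an empty list means the sizes are compatible.

    Hypothetical helper for illustration, assuming tensor parallel size 1.
    """
    problems = []
    # Ulysses gives each rank a subset of heads, so the heads must split evenly.
    if num_heads % ulysses_size != 0:
        problems.append(
            f"num_heads={num_heads} is not divisible by ulysses_size={ulysses_size}"
        )
    # The sequence is sharded across both the Ulysses and the context-parallel group.
    total_sp = ulysses_size * context_size
    if seq_len % total_sp != 0:
        problems.append(
            f"seq_len={seq_len} is not divisible by ulysses*context={total_sp}"
        )
    return problems


ok = check_ulysses_config(seq_len=8192, num_heads=32, ulysses_size=2)
bad_heads = check_ulysses_config(seq_len=8192, num_heads=30, ulysses_size=4)
bad_seq = check_ulysses_config(seq_len=1000, num_heads=32, ulysses_size=2, context_size=3)
```

Here ok is an empty list, while bad_heads and bad_seq each report one problem.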

Related legacy PR: https://github.com/FlagOpen/FlagScale/pull/156

Reference: Jiarui Fang and Shangchun Zhao. 2024. USP: A Unified Sequence Parallelism Approach for Long Context Generative AI. https://doi.org/10.48550/arXiv.2405.07719