volcengine / veScale

A PyTorch Native LLM Training Framework
http://vescale.xyz
Apache License 2.0
553 stars, 26 forks

[Example] add an example of running open Mixtral 8x7B in 4D using veScale #24

Closed: Vremold closed this 4 months ago

Vremold commented 4 months ago

This PR adds a 4D parallelism example that uses veScale to run a Mixtral 8x7B model imported directly from HuggingFace, without any modifications to the model code.

This PR also adds a debug utility for printing logs and reorganizes some of the code structure.