[Example] add an example of finetuning nanoGPT in 4D using veScale

volcengine / veScale

A PyTorch Native LLM Training Framework

http://vescale.xyz

Apache License 2.0


[Example] add an example of finetuning nanoGPT in 4D using veScale #16

Closed: lichen225 closed 5 months ago

lichen225 commented 5 months ago

This PR adds an example of finetuning a GPT-2 model with the veScale API using 4D parallelism: Data, Tensor, Sequence, and Optimizer Parallelism. The model code requires near-zero changes. This PR also improves the factory methods for DTensors and simplifies the DModule APIs.
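To make the 4D layout more concrete, here is a minimal sketch in plain Python that does not call the veScale API. It assumes that the four parallelisms are laid out on a 2D grid of ranks, with Tensor and Sequence Parallelism sharing one axis and Data and Optimizer Parallelism sharing the other. That mapping, the row-major rank order, and the helper names `build_mesh` and `split_groups` are assumptions made for illustration and are not taken from this PR.

```python
from typing import List, Tuple


def build_mesh(world_size: int, tp_size: int) -> List[List[int]]:
    """Arrange ranks 0..world_size-1 into a (dp, tp) grid in row-major order.

    Hypothetical helper: each row is assumed to be one tensor/sequence-parallel
    group, and each column one data/optimizer-parallel group.
    """
    if tp_size <= 0 or world_size % tp_size != 0:
        raise ValueError("world_size must be a positive multiple of tp_size")
    dp_size = world_size // tp_size
    return [[dp * tp_size + tp for tp in range(tp_size)] for dp in range(dp_size)]


def split_groups(mesh: List[List[int]]) -> Tuple[List[List[int]], List[List[int]]]:
    """Return (dp_groups, tp_groups) for a 2D mesh of ranks."""
    tp_groups = [list(row) for row in mesh]        # ranks that shard weights/sequence
    dp_groups = [list(col) for col in zip(*mesh)]  # ranks that shard data/optimizer state
    return dp_groups, tp_groups


if __name__ == "__main__":
    mesh = build_mesh(world_size=4, tp_size=2)
    dp_groups, tp_groups = split_groups(mesh)
    print("mesh:", mesh)
    print("data/optimizer-parallel groups:", dp_groups)
    print("tensor/sequence-parallel groups:", tp_groups)
```

Under these assumptions, 4 GPUs with a tensor-parallel size of 2 form two tensor/sequence-parallel groups and two data/optimizer-parallel groups, so every rank belongs to exactly one group along each axis.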