volcengine / veScale

A PyTorch Native LLM Training Framework
http://vescale.xyz
Apache License 2.0
553 stars 26 forks

[PyTorch] Add patches for distributed randomness #29

Closed. lichen225 closed this 4 months ago

lichen225 commented 4 months ago

In this PR, we update the PyTorch patch to include the DTensor sharding info in the CUDA RNG states.
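
For readers unfamiliar with why sharding info matters for randomness, here is a minimal sketch of the general idea, not the actual patch. It assumes an even 1-D split and uses a hash-based stand-in for a counter-based generator; `counter_rand`, `shard_offset`, `sharded_rand`, and `full_rand` are hypothetical names for illustration only. Each rank starts drawing at the global index of its shard, so the concatenated shards match what a single device would produce.

```python
import hashlib
import struct


def counter_rand(seed: int, offset: int) -> float:
    """Stand-in for a counter-based RNG: deterministic value in [0, 1) for (seed, offset)."""
    digest = hashlib.sha256(struct.pack("<QQ", seed, offset)).digest()
    # Keep the top 53 bits so the result is an exact double strictly below 1.0.
    return (int.from_bytes(digest[:8], "little") >> 11) / 2**53


def shard_offset(rank: int, world_size: int, numel: int) -> int:
    """Global start index of this rank's shard (assumption: even 1-D split)."""
    assert numel % world_size == 0, "sketch only handles evenly divisible shards"
    return rank * (numel // world_size)


def sharded_rand(seed: int, rank: int, world_size: int, numel: int) -> list[float]:
    """Random values for one rank's local shard, drawn at its global offsets."""
    start = shard_offset(rank, world_size, numel)
    local_numel = numel // world_size
    return [counter_rand(seed, start + i) for i in range(local_numel)]


def full_rand(seed: int, numel: int) -> list[float]:
    """Reference: the values a single device would draw for the whole tensor."""
    return [counter_rand(seed, i) for i in range(numel)]


if __name__ == "__main__":
    seed, world_size, numel = 1234, 4, 16
    gathered = []
    for rank in range(world_size):
        gathered.extend(sharded_rand(seed, rank, world_size, numel))
    print(gathered == full_rand(seed, numel))
```

Without the per-shard offset, every rank sharing the same seed would draw identical values for its shard, which is the kind of mismatch that tracking sharding info in the RNG state is meant to avoid.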

lichen225 commented 4 months ago

@lichen225 @JsBlueCat I see a lot of unchanged code highlighted in green?

I think GitHub applies this syntax highlighting to patch files by default.