pytorch / torchtitan

A native PyTorch Library for large model training
BSD 3-Clause "New" or "Revised" License

[wip] differentiate Rstd vs rstd #294

Closed: lessw2020 closed this 1 month ago

lessw2020 commented 5 months ago

Update the name of the main `Rstd` tensor so it is clearly distinct from `rstd`, to see which of the two registering fusedRMS as an op is really concerned with.
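For context, here is a minimal pure-Python sketch of why names that differ only by case are easy to confuse in an RMSNorm kernel. It assumes the usual convention, not confirmed by this PR, that `rstd` is the per-row reciprocal RMS and `Rstd` is the tensor that stores those values for the backward pass. The names `rms_norm_row`, `rms_norm`, and `rstd_buffer` are hypothetical and are not taken from the torchtitan source.

```python
import math


def rms_norm_row(x, weight, eps=1e-6):
    """Normalize a single row and return (output, rstd)."""
    mean_sq = sum(v * v for v in x) / len(x)
    # Per-row scalar: reciprocal of the root mean square.
    rstd = 1.0 / math.sqrt(mean_sq + eps)
    y = [v * rstd * w for v, w in zip(x, weight)]
    return y, rstd


def rms_norm(rows, weight, eps=1e-6):
    """Normalize every row, saving each row's rstd for a later backward pass."""
    # Hypothetical name: plays the role assumed here for the `Rstd` tensor.
    # A distinct name avoids confusing it with the per-row scalar `rstd`.
    rstd_buffer = []
    out = []
    for x in rows:
        y, rstd = rms_norm_row(x, weight, eps)
        out.append(y)
        rstd_buffer.append(rstd)
    return out, rstd_buffer
```

Giving the saved buffer a name that does not collide with the scalar when compared case-insensitively makes it easier to tell which of the two a tool or error message refers to.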