Closed: hanwen-sun closed this 2 months ago
Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.
Change the tensor parallel communication strategy in clip_grad:
Changes to the L2 norm computation:
TODO: