-
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
### Search before asking
- [X] I have searched in the [issues](http…
-
We want to add support in the Profiler tool for the top 1-3 use cases for diagnosing the causes of GPU slowness.
1. Highest stage time (descending order) contributors -- like spill amount per stage, skew (input…
-
```
____________________________________________________________________________ ERROR collecting test/test_align.py ____________________________________________________________________________
test…
-
### Issue Checklist
- [X] I have properly named my issue
- [ ] I have checked the Issues/Discussions pages to see if my issue has already been reported
### Platform
Itch.io (Downloadable Build) - W…
-
### Overview
This issue is for tracking design work related to [User Story #635]. [COORDINATOR can view submitted applications for Hosts & Guests].
As the new PM, research the coordinator role and…
-
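In theory, shouldn't ZeRO stage 3 use the least GPU memory but run the slowest? In my tests, however, stage 3 used the most GPU memory and was also the slowest.
Test models: BLOOM 560M & BLOOM 1.1B, training batch size 1
The data is belleMath.json, released by BELLE.
GPU: TITAN RTX 24GB *2
Test results: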
ZeRO Stage=1:
…
-