-
For DDP, rank 0's weights are synced to all ranks before forward. For FSDP, it would be nice to have a way to do this so that different weights from different ranks will be consistent before the forward…
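
To make the requested behavior concrete, here is a minimal sketch that simulates it in plain Python, without `torch.distributed`. Each rank holds its own copy of the parameters, and rank 0's values overwrite the others before the forward pass. The names `sync_from_rank0`, `is_consistent`, and `rank_states` are hypothetical and not part of DDP or FSDP. A real implementation would presumably use a collective broadcast rather than a list of dictionaries.

```python
from copy import deepcopy
from typing import Dict, List

# Hypothetical stand-in for a model's parameters on one rank.
Params = Dict[str, List[float]]


def sync_from_rank0(rank_states: List[Params]) -> List[Params]:
    """Return new per-rank states in which every rank holds a copy of rank 0's parameters."""
    if not rank_states:
        raise ValueError("need at least one rank")
    source = rank_states[0]
    # Deep-copy so that ranks do not share mutable storage after the sync.
    return [deepcopy(source) for _ in rank_states]


def is_consistent(rank_states: List[Params]) -> bool:
    """True when all ranks hold identical parameter values."""
    return all(state == rank_states[0] for state in rank_states)


if __name__ == "__main__":
    states = [{"w": [1.0, 2.0]}, {"w": [9.0, 9.0]}, {"w": [0.5, 0.5]}]
    print(is_consistent(states))                   # ranks start out different
    print(is_consistent(sync_from_rank0(states)))  # identical after the sync
```

The sketch returns new states instead of mutating its input, which keeps the before and after easy to compare when checking whether ranks had diverged.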
-
### Description
Currently, ``overload`` requires at least 2 ranks along each dimension. It would be good if it also worked with a single rank.
### Questions:
- [ ] what should happen in this case?…
-
My KGS rank is 3k? and my OGS rank is 1k.
![image](https://user-images.githubusercontent.com/30604526/231831528-ccf1c616-e5f8-4ab0-ac87-fa38eae10649.png)
-
```
Like utime + "autorank" works @ ULX.
1. Every 5 hours (for instance), you get ranked up to a new level.
(The real admin ranks excluded ofc)
Would have to be able to customize it obviously, to fit …
```
-
🚀 Feature
To better debug stuck ranks that have not reached `_store_based_barrier()`, we could record the ranks that have not called into the function and log them.
Currently…
-
I ran into an interesting result when comparing linear convergence and solution average for different numbers of MPI ranks in Albany Land Ice:
```
greenland, 1-10km, ML preconditioner, 320 ranks:
T…
```
-
This issue is intended to track the OpenMPI seg fault problem discussed last week. When running SNAP with OpenMPI on KNL using 1024 ranks, the application seg faults during initialization. This problem…
-
It appears that parallel simulations, with or without shared memory, are not working when using NonUniformRectCart.
I'm seeing errors like
```
Fatal error in PMPI_Group_incl: Invalid rank, error …
```
-
### Description & Motivation
https://github.com/pytorch/pytorch/pull/104810 adds the recommendation that the `save` APIs should be called in a single node (`shard_group`).
https://github.com/pyt…
-
When creating a new profile, the editor can manually construct the hierarchy if the name they entered is not matched. However, the hierarchy requires a rank for each level, and the list of available r…
m-r-c updated
3 years ago