-
[P1928R0](https://wg21.link/p1928r0) Merge data-parallel types from the Parallelism TS 2 (Matthias Kretz)
-
Can we implement the expert parallel strategy for MoE to fully exploit the sparse activation property? Ideally, MoE should only use compute on the order of the active parameters, but the current implement…
-
While continuing training of MoE models (loading an existing ckpt), assertion errors occurred at some steps, as follows:
"found NaN in local grad norm in backward pass before data-parallel communication colle…
-
While running the parallelized data processing routine (process_data_parallel function) in the data_processing.py script, an unhandled exception occurs, halting the entire operation. Error handling me…
-
### Do you need to file an issue?
- [X] I have searched the existing issues and this bug is not already filed.
- [X] My model is hosted on OpenAI or Azure. If not, please look at the "model providers…
-
Traceback (most recent call last):
File "tctrack_original/tools/train_tctrackpp.py", line 303, in <module>
main()
File "tctrack_original/tools/train_tctrackpp.py", line 298, in main
train(trai…
-
```
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 80/80 [00:00
-
We've run into this when using Xarray together with Dask. At the moment, the default way of calling it is:
```
file_list = []
model = "ACCESS-CM2"
variable = "hurs"
data_dir = f"s3:/…
-
When using torch.utils.data.DataLoader, data loading seems to cause a deadlock. May I ask if this is expected?
-
As [here](https://github.com/I2PC/xmipp/pull/639/files)
List of programs that use metadataDB:
- /src/xmipp/applications/programs (copy)/mpi_volume_homogenizer/mpi_volume_homogenizer.cpp
- /src/…