-
### Benefit
UGRC will benefit immensely from a mechanism that allows us to collaborate on datasets with our authoritative stewards. For now, this mechanism is our Enterprise/Portal.
We made some pro…
-
I am using the `ESMF_MeshCreateDual()` method to construct the dual mesh (swapping nodes and elements) of the NEPTUNE mesh. The original mesh is constructed with nodal and element coordinates -- as i…
-
### Describe the bug
When I run the training, I get this error.
### Reproduction
Run `accelerate config`:
```
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: FSDP
downcast_bf16: 'n…
-
I am attempting to train a model using the `tools/dist_train.sh` script as documented, but I'm encountering several errors and warnings during execution. Below is the command I used and the correspondin…
-
### Is your feature request related to a problem or challenge?
Suppose you are building a distributed query engine on top of DataFusion and you want to run a query like
```
SELECT facts.fact_value,…
-
### 🐛 Describe the bug
Torch does not allow 2D FSDP + TP to get FULL_STATE_DICT. However, if I remove checks here:
https://github.com/pytorch/pytorch/blob/3f62b05d31d4b29d60874b05adc0e5aedbad3722/to…
-
"D:\GPT-SoVITS-v2-240821\runtime\python.exe" GPT_SoVITS/s2_train.py --config "D:\GPT-SoVITS-v2-240821\TEMP/tmp_s2.json"
[E C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\d…
-
Thank you very much for sharing the XoFTR code. I have tried to apply this method in my work, but I observed that the matched points are concentrated in the center of the image, and there are…
-
### System Info
Env: pytorch 2.5 nightly, CUDA 12.4, python 3.10, NVIDIA Hopper GPU, 2 GPU, NCCL 2.21.5(?)
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### 🐛…
-
I tried an sccache distributed build today, and the build failed with errors like this:
```
1:38.92 /home/botond/dev/mozilla/central/js/src/jit/MIR.h:8337:218: error: result of comparison of unsi…