-
### 🐛 Describe the bug
`parallelize_module` failed with `nn.Transformer` and the `PairwiseParallel` plan, which is unexpected according to [the doc](https://pytorch.org/docs/stable/distributed.tens…
-
### 🐛 Describe the bug
`_copy` ops all accept an `out=` parameter. The usual behavior is this:
1. If the `out=` parameter is the correct size, the result is copied into it.
2. If the `out=` par…
-
I am installing nnUNet on a Docker container with CUDA driver 11.4. I first install the most recent torch compatible with the driver and it runs ok (11.8 should be compatible with 11.4):
```shell
…
-
### 🐛 Describe the bug
I am following this example code to save and load optimizer states using load_sharded_optimizer_state_dict: https://github.com/pytorch/pytorch/blob/c75e064dd6a2f800476bc84d4f…
-
**Describe the Bug**
Apex installation fails.
**Minimal Steps/Code to Reproduce the Bug**
Follow the recommended installation steps in https://github.com/NVIDIA/apex#linux.
```bash
git clone h…
-
Hi Mohit! Thank you first for the good code.
I have struggled to use your environment but it seems your environment must require a machine with a screen, since I always get the following error in my …
-
### 🐛 Describe the bug
Hello, I found that a standard DataLoader takes unreasonably long to construct itself and to load the first batch if there is a field in a dataset that takes long to pickle (…
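To illustrate why a slow-to-pickle field could matter here: as far as I understand, when DataLoader workers are started with the `spawn` start method, the dataset object is pickled and sent to each worker, so the serialization cost is paid before the first batch arrives. The sketch below uses only the standard library and does not touch PyTorch; `SlowToPickle`, `ToyDataset`, and `time_pickle` are hypothetical names invented for this illustration, and the artificial `sleep` stands in for whatever makes the real field expensive.

```python
import pickle
import time


class SlowToPickle:
    """Hypothetical stand-in for a dataset field that is expensive to pickle."""

    def __init__(self, delay_s):
        self.delay_s = delay_s

    def __getstate__(self):
        # Simulate an expensive serialization step (assumption for the demo).
        time.sleep(self.delay_s)
        return {"delay_s": self.delay_s}


class ToyDataset:
    """Minimal map-style dataset that carries the slow field along."""

    def __init__(self, field):
        self.field = field
        self.items = list(range(8))

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        return self.items[index]


def time_pickle(obj):
    """Return the wall-clock seconds spent in a single pickle.dumps(obj)."""
    start = time.perf_counter()
    pickle.dumps(obj)
    return time.perf_counter() - start


if __name__ == "__main__":
    dataset = ToyDataset(SlowToPickle(0.05))
    print(f"one pickle pass took {time_pickle(dataset):.3f}s")
```

If each worker receives its own pickled copy, the cost above would be paid roughly once per worker, which is one plausible reading of the slow start described in this report.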
-
### 🐛 Describe the bug
This issue is separated from https://github.com/pytorch/pytorch/issues/104952, verified on 2023-07-06 nightly: https://github.com/pytorch/pytorch/commit/13763f58ad86fadf49ef796…
-
### Your current environment
```
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.6 LTS (x86_64)
GCC versio…
-
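Hello, after processing the zero data I have about 70 million samples, which I store in an LMDB file. However, LMDB can only produce a single data.mdb, which sits on one device, so data reading becomes an I/O bottleneck during training and the CPU and GPU are underutilized. Has anyone run into this problem?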
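One possible workaround, which is an assumption on my part and not something from the original post, is to split the samples across several independent LMDB environments (each with its own data.mdb) that can live on different disks. The sketch below only shows how a key could be routed to a shard. `shard_for_key`, `shard_path`, and the `shard_000`-style directory names are hypothetical, and the actual LMDB reads and writes are left out.

```python
import zlib


def shard_for_key(key: bytes, num_shards: int) -> int:
    """Map a sample key to a shard index in [0, num_shards).

    CRC32 is used because it is stable across runs and processes,
    unlike Python's built-in hash() for bytes and str.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    return zlib.crc32(key) % num_shards


def shard_path(root: str, key: bytes, num_shards: int) -> str:
    """Return the (hypothetical) directory of the shard holding this key."""
    return f"{root}/shard_{shard_for_key(key, num_shards):03d}"


if __name__ == "__main__":
    for sample_key in (b"sample-0", b"sample-1", b"sample-2"):
        print(sample_key, shard_path("/data/lmdb", sample_key, 4))
```

The writer and the reader would both call `shard_path` with the same `num_shards`, so a given key always resolves to the same environment.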