-
Hi,
Thanks for your great work. Running the truck scene produces the following error:
`
/opt/conda/conda-bld/pytorch_1659484801627/work/aten/src/ATen/native/cuda/IndexKernel.cu:91: operato…
-
The Stable Diffusion Colab does not open. How can I solve it?
It gives the following error:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/transformers/utils/import_…
-
This only occurs when using 8-bit Adam.
With FSDP1 I run into:
FSDP config:
param_dtype: bf16
reduce_dtype: fp32
```
Traceback (most recent call last):
File "", line 198, in _run_mo…
-
When utilizing multi-GPU training with DDP, e.g. with a `SemiSupervisedDataLoader` and subsequently `ConcatDataLoader`, an error is raised because the `distributed_sampler` keyword argument required for…
-
Hello, during training I get the following error:
ERROR Was not able to read git information, trying to continue without.
ERROR Could not log req: stderr not empty
Traceback (most recent call last):
…
-
Hi, I tried to test a hyperparameter sweep with PyTorch Lightning using a minimal example (1-D regression with one hidden layer). The sweep appears to start, but I keep getting error messages …
-
### Bug description
In `FSDPStrategy.save_checkpoint`, the `filepath` variable is transformed via
https://github.com/Lightning-AI/pytorch-lightning/blob/3627c5bfac704d44c0d055a2cdf6f3f9e3f9e8c1/src/…
-
Hello! Really appreciate your outstanding work!
However, when I try to retrain `geo2mat`, I encounter this problem:
```python
Time stamp: #5 save blend and glbs
524 0.06638479232788086 0.0918…
-
### Bug description
I am training a sample model that works on multiple GPUs as long as they are on different nodes. But as soon as I allocate more than one GPU on a node, it returns `[rank7]: torch.dist…
-
### 🐛 Describe the bug
Hi,
I'm trying to train F-RCNN, based on the COCO dataset, on my own images. The image size is 512x512.
I've tested the dataloader separately, and it works and prints the batch images and …