-
### Description
If you try to integrate this `EmbeddedChat` package into a React application created using `create-react-app`, as soon as you log in, you will encounter a warning or error similar t…
-
### 📚 The doc issue
I found these environment variables in the PyTorch code. Is there any documentation that describes when each of them should be used?
TORCH_NCCL_BLOCKING_WAIT
TORCH_NCCL_ASYNC_ERROR_HANDLING…
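
While looking for that documentation, it can help to see which of these variables are actually set in the process that launches training. The snippet below is only a minimal sketch using the standard library. The helper name `report_nccl_env` is hypothetical, the tuple holds only the two names visible above (the original list is truncated), and nothing here describes how PyTorch interprets the values.

```python
import os

# Names copied from the issue text above. The original list is truncated,
# so this tuple is not exhaustive.
NCCL_ENV_VARS = ("TORCH_NCCL_BLOCKING_WAIT", "TORCH_NCCL_ASYNC_ERROR_HANDLING")


def report_nccl_env(environ=None):
    """Return a dict mapping each listed variable to its value, or None if unset.

    `environ` defaults to os.environ; passing a plain dict makes the helper
    easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in NCCL_ENV_VARS}


if __name__ == "__main__":
    # Print one line per variable, marking the ones that are not set.
    for name, value in report_nccl_env().items():
        shown = value if value is not None else "<unset>"
        print(f"{name}={shown}")
```

Running this in the same shell that launches the training job shows whether a variable was exported there; it does not show values set later inside the training script itself.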
-
## 🐛 Bug
When utilizing the **StreamingDataset** to read data directly from AWS S3 with Distributed Data Parallel (DDP), the following warning message is displayed:
```
lib/python3.10/site-pack…
-
Setting `devices=-1` in train.py on a machine with 2 or more GPUs causes these errors:
RuntimeError: It looks like your LightningModule has parameters that were not used in producing the loss returned by…
-
### System Info
- deepspeed version: 0.14.2
- `transformers` version: 4.42.3
- Platform: Linux-5.10.134-13.al8.x86_64-x86_64-with-glibc2.30
- Python version: 3.11.9
- Huggingface_hub version: 0…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### YOLOv8 Component
_No response_
### Bug
…
-
### Bug description
The documentation (https://lightning.ai/docs/fabric/stable/api/fabric_args.html#strategy) states that to run DDP with "find_unused_parameters=True" we can use the strategy strin…
-
Hello, when training with DDP, the program hangs and GPU utilization stays at 100%. Is this a problem with the data? (I am using my own data here.)
-
- The issue is described here: https://github.com/DalgoT4D/DDP_backend/issues/738
-
Hi, thanks for the incredible library! We've been using PyTorch Metric Learning for a task that involves around 300,000 images belonging to a large number of classes. We're quite new to metric learning and DD…