-
Lately we have been observing hundreds of warnings about the dashboard ports being occupied. While I understand that we may occasionally have an overlap, I would expect every test to close the HTTP server p…
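To illustrate the expectation that every test releases its port, here is a minimal sketch. It assumes a plain standard-library HTTP server, not the project's actual dashboard server, and the helper names `temporary_http_server` and `port_is_free` are hypothetical.

```python
import contextlib
import http.server
import socket
import threading


@contextlib.contextmanager
def temporary_http_server():
    """Start an HTTP server on an OS-assigned port and always shut it down.

    Hypothetical helper: a stand-in for whatever server a test starts.
    """
    # Port 0 asks the OS for any free port, so parallel tests do not collide.
    server = http.server.ThreadingHTTPServer(
        ("127.0.0.1", 0), http.server.BaseHTTPRequestHandler
    )
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield server.server_address[1]  # the port actually bound
    finally:
        server.shutdown()      # stop the serve_forever() loop
        server.server_close()  # release the listening socket
        thread.join()


def port_is_free(port):
    """Return True if a fresh socket can bind to the given local port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind(("127.0.0.1", port))
        except OSError:
            return False
        return True
```

Wrapping the server in a context manager (or an equivalent test fixture) means the port is released even when the test body raises, which is the usual source of leftover "port occupied" warnings.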
-
Could you please specify the license explicitly in the pf_ring.h file?
The absence of an explicit declaration creates ambiguity when linking with the user-space pfring library.
The library itself is …
-
Related to https://github.com/dotnet/runtime/issues/45496
[W3C Baggage](https://www.w3.org/TR/baggage/) defines a propagation format for arbitrary application-specific properties.
Despite being dr…
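For readers unfamiliar with the format, here is a rough sketch of parsing a `baggage` header value. It is written in Python for brevity and is not the .NET API discussed in the linked issue. It assumes comma-separated `key=value` members with optional `;`-separated properties and percent-encoded values, and the function name `parse_baggage` is hypothetical.

```python
from urllib.parse import unquote


def parse_baggage(header):
    """Parse a `baggage` header value into {key: (value, [properties])}.

    Simplified sketch: it skips malformed members instead of validating
    them against the full grammar in the specification.
    """
    entries = {}
    for member in header.split(","):
        member = member.strip()
        if not member:
            continue  # ignore empty list members
        key_value, *properties = member.split(";")
        if "=" not in key_value:
            continue  # ignore members without a key=value pair
        key, value = key_value.split("=", 1)
        # Values are percent-encoded on the wire; decode them for the caller.
        entries[key.strip()] = (
            unquote(value.strip()),
            [prop.strip() for prop in properties],
        )
    return entries
```

For example, `parse_baggage("userId=alice,serverNode=DF%2028")` yields `userId` mapped to `("alice", [])` and `serverNode` mapped to `("DF 28", [])`.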
-
### What is the problem the feature request solves?
# Rationale
The Arrow ecosystem lacks standard database interfaces built around Arrow data, especially for efficiently fetching large datasets (i.…
-
### 🚀 The feature, motivation and pitch
Looking at https://pytorch.org/docs/stable/elastic/errors.html, I don't see any way to avoid restarts when a non-retriable user error has happened. It is often…
-
``` bash
$ sh ./single_run.sh >> "error.txt"
./single_run.sh: 4: source: not found
/home/dse316/miniconda3/envs/grp_007/lib/python3.10/site-packages/torch/distributed/launch.py:178: FutureWarning…
```
-
### Issue type
Crash or freeze
### Description with steps to reproduce
1. Open attached file
2. Select a whole row with shift
3. Click on any note multiple times on the top bar to change row note…
-
Hi
I am encountering an error when using 2 GPUs for training YOLOv10n. Here is the error:
```
(yolov10) C:\Users\muh\yolov10>yolo detect train data=coco.yaml model=yolov10n.yaml epochs=500 batc…
```
-
[rank0]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/usr/loca…
-
Thanks for the amazing work on accelerating distributed training. When I use `deepspeed train.py` to start a Megatron-LM training task, I get this log:
![image](https://github.com/user-attachments/assets/61e6646…