-
**Describe the bug**
A clear and concise description of what the bug is.
1. I am training a Mixtral 8x7B model. After 270 training steps it hangs; after 30 minutes NCCL times out and the process is killed.
Inva…
-
First of all, thanks for this awesome work!
I have a Moes T6E and just updated the app to V2.2.0.
Now when I select "tools", the app crashes, regardless of whether "active" is ON or OFF.
I could not find …
-
### Your current environment
```text
The output of `python collect_env.py`
```
### 🐛 Describe the bug
We have received quite a lot of reports about "Watchdog caught collective operation timeout", which …
-
## 日期 Date
1 Mar 2019
## 地點 Location
- Hong Kong
## 場地 Venue
Station for Open Cultures in Eaton Hotel (380 Nathan Road)
## 題目 Sessions
- in @ , with one-line to describe
-
Hello!
I have been stuck for days and weeks trying to flash one of my Moes GALWW-N thermostats, which is basically a BAC-002ALWW (https://www.moeshouse.com/collections/smart-thermostat/products/wifi-cen…
-
**Bug description**
Context: Running inference on a multi-modal LLM. At each decoding step, which parts of the network are used depends on the input modality at that step. In my second step, DeepSpeed …
-
Whenever a pull request is made, please put a summary of your knowledge in here.
-
I quite like using interaction plots when exploring data and thought something like this could be appropriate for R-Instat, especially since it is quite an easy couple of functions to imple…
-
### Your current environment
### Anything you want to discuss about vllm.
### Your current environment
The output of `python collect_env.py`
```text
PyTorch version: 2.4.0+cu121
Is debug…
-
Hi,
I've been running with an older image for quite a while successfully (undionly.kpxe from around 2020 or 2021) and this only started showing up with newer machines. I've updated to a current ver…