-
Hi lingvo contributors,
Thanks for the prompt response to my previous ticket on the Docker version.
I want to run GPipe together with data parallelism on an 8x GPU server. I searched around and fou…
-
Firstly, thank you for a great repository.
I have a question regarding parallelism using whisper-live vs. faster-whisper on a single GPU. In this faster-whisper [issue](https://github.com/SYSTRAN/f…
-
Hi. I've checked your good results, and I just want to say thank you for developing such an amazing model.
My analysis suggests the deadlock in DeepSpeed ZeRO-3 is caused by a for statement in the code.…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTor…
-
**It seems BF16/Z1/PP doesn't support having only one embedding LayerSpec in the first stage, even though the layer is very small. When at least two decoder layers are added in both the first and last stages, it could …
-
Do we need to (optionally?) allow separate UDTFs for the same function signature for CPU and GPU?
- This will depend on whether UDTFs are expected to be CUDA-aware or not, or stated differently, do we …
pearu updated 3 years ago
-
### Your current environment
- `vllm==0.5.3.post1`
- `python=3.9`
### 🐛 Describe the bug
When using `distributed_executor_backend=mp` with vLLM version `vllm==0.5.3.post1`, the process doe…
-
# Data Parallelism
Data parallelism replicates the model on every device to generate gradients independently and then communicates those gradients at each iteration to keep model replicas consiste…
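The replicate–compute–average loop described above can be sketched in a single process, with plain Python standing in for the devices and the collective communication. This is an illustrative sketch only (the model, shards, and learning rate here are invented for the example, not taken from any framework): each "device" computes a local gradient on its own data shard, the gradients are averaged (the all-reduce step), and every replica applies the identical update so the copies stay consistent.

```python
# Toy data-parallel SGD sketch (single process, no real multi-GPU
# communication). Model: y = w * x, targets fixed at 2 * x, so the
# optimum is w = 2. All names here are hypothetical, for illustration.

def local_grad(w, shard):
    # Gradient of mean squared error over this device's data shard.
    return sum(2 * (w * x - 2 * x) * x for x in shard) / len(shard)

def data_parallel_step(w, shards, lr=0.1):
    # 1) Each replica computes a gradient independently on its shard.
    grads = [local_grad(w, shard) for shard in shards]
    # 2) "All-reduce": average gradients across replicas.
    g = sum(grads) / len(grads)
    # 3) Every replica applies the same update, keeping weights in sync.
    return w - lr * g

w = 0.0
shards = [[1.0, 2.0], [3.0, 4.0]]  # dataset split across two "devices"
for _ in range(200):
    w = data_parallel_step(w, shards)
print(round(w, 3))  # converges toward the optimum w = 2.0
```

In a real framework the averaging in step 2 is a collective operation (e.g. an all-reduce over NCCL), but the invariant is the same: after each iteration every replica holds identical weights.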
-
Hello! Having studied the documentation provided, I still could not work out whether GGUF quantized models are supported on AMD GPUs. I would like to use the Q8 or even Q4 model based on Mistr…
-
**Describe the issue**
I can obtain correct results when using a single GPU to call AMGX to solve a system of linear equations (Poisson's equations), but when using OpenMPI and multi-GPU parallel…