-
Machines
- dual RTX 4090
- dual RTX A4500
- single RTX A6000
- single RTX A4000
- single RTX 3500 Ada
Concentrate on the A6000 and A4000 with 10 Gbps networking
- https://www.tensorflow.org/guide/distributed_trai…
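The TensorFlow guide linked above configures multi-machine training through the `TF_CONFIG` environment variable. A minimal sketch of building that variable for two workers (the hostnames `a6000-box` and `a4000-box` and port are hypothetical placeholders for the machines above):

```python
import json
import os

# Hypothetical cluster spec for tf.distribute.MultiWorkerMirroredStrategy:
# one worker per machine; hostnames are made-up placeholders.
cluster = {
    "cluster": {
        "worker": ["a6000-box:12345", "a4000-box:12345"],
    },
    # This process is worker 0 (the chief); the other machine sets index 1.
    "task": {"type": "worker", "index": 0},
}

# TensorFlow reads this JSON from the TF_CONFIG environment variable
# before the strategy is constructed.
os.environ["TF_CONFIG"] = json.dumps(cluster)
print(len(json.loads(os.environ["TF_CONFIG"])["cluster"]["worker"]))  # 2
```

Each machine runs the same script with only the `task.index` changed.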
-
### What happened + What you expected to happen
The `predict` method fails with the following error when the model has been trained on multiple GPUs with the `ddp_spawn` strategy:
`TypeError: vstack(): ar…
-
In the current master (`d0f20abaa58d6da3876c58363fb1390c5d32a7a2`), the meaning of `DEVICE_MEM` in `sky show-gpus` seems inconsistent. For example, on AWS it represents the total device memory acro…
-
## ❓ General Questions
What is the proper way to actually utilize multiple GPUs? When I generate the config, compile, and load the MLCEngine with multiple tensor shards, it still errors out if the m…
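For reference, multi-GPU serving in MLC LLM is controlled by the `tensor_parallel_shards` field of the generated `mlc-chat-config.json`. The sketch below only shows patching that field with the standard library; the file contents are illustrative, and this is not claimed to fix the error above:

```python
import json
from pathlib import Path

# Illustrative stand-in; a real mlc-chat-config.json has many more keys.
config_path = Path("mlc-chat-config.json")
config_path.write_text(json.dumps({"model_type": "llama", "tensor_parallel_shards": 1}))

# Set the number of tensor-parallel shards to the number of GPUs (here: 2).
config = json.loads(config_path.read_text())
config["tensor_parallel_shards"] = 2
config_path.write_text(json.dumps(config, indent=2))

print(json.loads(config_path.read_text())["tensor_parallel_shards"])  # 2
```

The model then has to be re-compiled so the weight shards match the config.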
-
### Describe the bug
Waydroid is stuck in a boot loop: with `multi_windows` mode enabled, it boots to the home screen and then crashes.
### Waydroid version
1.4.2
### Device
Linux Desktop
##…
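A commonly suggested workaround (this disables the feature rather than fixing the crash) is to turn multi-window mode off, either with `waydroid prop set persist.waydroid.multi_windows false` or by setting the property in Waydroid's base prop file:

```
# /var/lib/waydroid/waydroid_base.prop (restart the Waydroid session afterwards)
persist.waydroid.multi_windows=false
```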
-
Currently I have changed example.sh to:
```
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --gres=gpu:4
#SBATCH --cpus-per-task=28
#SBATCH --partition=gpu
#SBATCH --exclude=gpu19,gpu3,gpu8,gpu14,gpu4
#SBATCH --job-name=llava…
```
-
When running `python multi_thread_process_to_doc.py ./pdf --process-num 8`, I get an ERROR:
Due to the package size limitation, we have not provide fastdeploy-gpu-python on pypi yet, please execute the f…
-
I was testing `litgpt serve` with llama-3-70b on 4x A100 80G and I receive an OOM error. I tried the same command with llama-2-13b, and it seems that specifying the `devices` argument only loads multiple re…
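A back-of-envelope check (assuming bf16 weights only, ignoring KV cache and activations) shows why a 70B model OOMs when each device holds a full replica of the weights but fits when they are sharded across the four GPUs:

```python
# Rough memory math; all numbers are approximations.
params = 70e9          # llama-3-70b parameter count
bytes_per_param = 2    # bf16
weights_gb = params * bytes_per_param / 1e9   # ~140 GB of weights

gpu_mem_gb = 80        # A100 80G
num_gpus = 4

# If `devices=4` merely loads a full replica per GPU, each needs ~140 GB:
print(weights_gb > gpu_mem_gb)              # True  -> OOM
# Sharding the weights across the 4 GPUs needs ~35 GB per device:
print(weights_gb / num_gpus > gpu_mem_gb)   # False -> weights alone fit
```

A 13B model (~26 GB in bf16) fits as a full replica on one 80 GB card, which is why llama-2-13b does not hit the same error.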
-
Thanks for the nice work!
I want to run `app.py` on multiple GPUs because of a GPU memory problem.
But if I change the line
https://github.com/gaomingqi/Track-Anything/blob/e6e159273790974e04eeea6673f1f93c…
-
**Describe the bug**
When attempting to catch CUDA-OOM runtime errors during segmentation postprocessing and switch the postprocessing to the CPU, a `RuntimeError("Transform Tracing must be enabled to …
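The intended pattern — sketched generically here, with no specific library's API assumed and a hypothetical `postprocess(…, device=…)` callable — is to catch the OOM `RuntimeError` and retry the same postprocessing on CPU; the bug above is that a different `RuntimeError` surfaces inside that path:

```python
# Generic CUDA-OOM fallback sketch; the postprocessing callable and its
# `device` keyword are hypothetical placeholders.
def postprocess_with_cpu_fallback(postprocess, segmentation, device="cuda"):
    try:
        return postprocess(segmentation, device=device)
    except RuntimeError as err:
        # Only fall back for OOM; re-raise anything else (e.g. the
        # "Transform Tracing must be enabled" error from the report).
        if "out of memory" not in str(err).lower():
            raise
        return postprocess(segmentation, device="cpu")

# Toy postprocess that "OOMs" on GPU and succeeds on CPU:
def toy_postprocess(seg, device):
    if device == "cuda":
        raise RuntimeError("CUDA out of memory.")
    return [x > 0 for x in seg]

print(postprocess_with_cpu_fallback(toy_postprocess, [-1, 2]))  # [False, True]
```

Matching on the error message is brittle but common, since `torch.cuda.OutOfMemoryError` only exists in newer PyTorch versions.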