-
Hello, I've tried your SCALE tools. I want to know how to control the CPU usage.
-
I have a couple of questions.
1. In the paper you say you used a batch size of 1920 when experimenting with stage 2. Can you tell me how many GPU nodes you actually used, and how many GPUs per node?
2. …
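For context on why the node and GPU counts matter: a global batch size is usually the product of nodes, GPUs per node, the per-GPU batch, and any gradient-accumulation steps. The sketch below only illustrates that arithmetic. The function name and the example layouts (4 x 8 and 2 x 8 GPUs) are hypothetical and are not the paper's actual setup, which the question above is asking about.

```python
def per_gpu_batch(global_batch, nodes, gpus_per_node, grad_accum=1):
    """Return the per-GPU micro-batch implied by a global batch size.

    All arguments are illustrative assumptions, not values from the paper.
    """
    world_size = nodes * gpus_per_node
    divisor = world_size * grad_accum
    if global_batch % divisor != 0:
        raise ValueError(
            f"global batch {global_batch} is not divisible by "
            f"{world_size} GPUs x {grad_accum} accumulation steps"
        )
    return global_batch // divisor


# Hypothetical layouts that all reach a global batch of 1920.
print(per_gpu_batch(1920, nodes=4, gpus_per_node=8))                # 32 GPUs
print(per_gpu_batch(1920, nodes=2, gpus_per_node=8))                # 16 GPUs
print(per_gpu_batch(1920, nodes=2, gpus_per_node=8, grad_accum=4))  # 16 GPUs, accumulated
```

Many different hardware layouts reach the same global batch, which is why the batch size alone does not reveal the node count.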
-
We tried to run Mesh-TensorFlow to train T5 on GPUs following the instructions on T5's repository, but the training is extremely slow.
> global_step/sec: 0.0467347
> examples/sec: 0.186939
The…
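The two logged rates above can be combined to see what they imply. This is a minimal sketch of that arithmetic only. It assumes both numbers come from the same logging window, and the 1000-step horizon is a hypothetical figure chosen for illustration.

```python
# Rates copied from the log lines quoted above.
steps_per_sec = 0.0467347
examples_per_sec = 0.186939

# Examples processed per optimizer step, i.e. the effective batch size.
examples_per_step = examples_per_sec / steps_per_sec

# Wall-clock time for one step.
seconds_per_step = 1.0 / steps_per_sec


def hours_for(num_steps):
    """Estimated wall-clock hours for num_steps at the logged rate."""
    return num_steps * seconds_per_step / 3600.0


print(f"effective batch size: {examples_per_step:.2f} examples/step")
print(f"step time: {seconds_per_step:.1f} s")
print(f"1000 steps (hypothetical horizon): {hours_for(1000):.1f} h")
```

At these rates each step takes a little over 21 seconds and covers about 4 examples, so the effective batch size and the per-step time are both worth checking.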
-
I am attempting to train a model using the tools/dist_train.sh script as documented, but I'm encountering several errors and warnings during execution. Below is the command I used and the correspondin…
-
### Hi,
I think the current AMD ROCm doesn’t work well with multiple video cards. I have an XTX 7900 (24GB) and an XT 7900 (20GB). My processor also has a small integrated GPU, but that shouldn’t b…
-
I was wondering if you'd be willing to add multi-GPU support that takes each GPU and its VRAM into account?
-
Hi Zhang,
Thank you for sharing your nice work. We found that your provided docres.pkl achieves promising results on the Doc3D dataset. We then tried to train a new DocRes model on Doc3D based on the same set…
-
I'm using DIGITS 6.0.0 with caffe 0.5.14 on an 8 GPU EC2 instance (p2.8xlarge, 11.2GB memory) with 640x640 images. Previously I was able to train the same model on a 1 GPU instance (p2.xlarge, 11.2GB …
-
Machines
- dual 4090 Ada
- dual A4500
- single A6000
- single A4000
- single 3500 Ada
Concentrate on the A6000 and A4000 with 10 Gbps networking
- https://www.tensorflow.org/guide/distributed_trai…
-
Always defaults to GPU 0
I’m experiencing an issue with Unsloth where it defaults to using GPU 0, regardless of the settings. This problem persists even when I try to specify a different GPU usin…
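One general way to restrict a process to a particular GPU is the `CUDA_VISIBLE_DEVICES` environment variable, set before CUDA is initialized. The sketch below assumes an NVIDIA/CUDA setup. The helper name `pin_gpu` is hypothetical, and whether this resolves the issue described above is not known, since the post does not show which method was tried.

```python
import os
import sys


def pin_gpu(index):
    """Expose only one physical GPU to this process.

    Returns True if torch had already been imported when this was called,
    in which case the setting may be too late to take effect reliably.
    Hypothetical helper for illustration only.
    """
    already_imported = "torch" in sys.modules
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return already_imported


# Call this at the very top of the script, before importing any GPU library.
late = pin_gpu(1)
if late:
    print("warning: torch was imported before the GPU was pinned")

# After this point the selected physical GPU is the only visible device,
# so frameworks will address it as device 0 (for example "cuda:0").
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

Because the chosen card is renumbered to device 0 inside the process, logs that report "GPU 0" do not by themselves show that the wrong physical card is in use. Comparing against `nvidia-smi` is a more reliable check.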