-
Why are you using only a single GPU?
If you use DistributedDataParallel or DataParallel, does it slow down?
-
I have been searching, without success, for a solution that supports high-performance computing clusters. This consists of running two or more computing clusters, each with multiple GPUs.
Would n…
-
I didn't find Azure CycleCloud for CPU when parsing the mt-gemm results. For example, it should be in the plot here:
https://github.com/converged-computing/performance-study/tree/main/analysis/mt-g…
vsoch updated
2 months ago
-
**Which section will this topic belong to**
- [ ] 1. Why use a Cluster?
- [ ] 2. Working on a remote HPC system
- [ ] 3. Scheduling jobs
- [ ] 4. Accessing software
- [ ] 5. Transferring…
-
Hello! Great work. I have some problems with the reproduction process.
1. Is it normal to get a large number of NaN loss values when you first start training?
2. Could you please te…
-
The compile command I entered was: nvcc _O3 _DUSE_DP xxx.cu. The error message is as follows: 9.cu(37): error: no instance of overloaded function "atomicAdd" matches the argument list
argument types are: (real *, real)
atomicAdd(…
-
I was wondering if get_distance_matrix could go faster by using a GPU, which seems to be a possibility with dask functions. What do you think?
-
I have two GPUs on our server, but when I run the model, the second GPU is never used. I know my job is not very big, so the first GPU isn't fully occupied. But I think if you can su…
-
### Describe the bug
Building the VS project fails when the backend is CUDA 12.5.
### Steps to reproduce
1. Prerequisites:
   1. Windows 11,
   2. CUDA 12.5,
   3. MSVC 19.41 (VS 2022 Preview),
…
-
Have a single list of exclusive apps. For each app have checkboxes for
- whether to suspend all computing
- whether to suspend GPU computing
- whether to suspend file transfers
This would be a mediu…
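
A minimal sketch of the per-app settings described above, in case it helps the discussion. Everything here is an assumption for illustration: the names `ExclusiveApp`, `suspend_all`, `suspend_gpu`, `suspend_file_transfers`, and `active_suspensions` are hypothetical and not taken from any existing code. It models one list of exclusive apps with one flag per checkbox, and combines the flags of whichever listed apps are currently running.

```python
from dataclasses import dataclass


@dataclass
class ExclusiveApp:
    # Hypothetical record: one entry per exclusive app, one flag per checkbox.
    name: str
    suspend_all: bool = False
    suspend_gpu: bool = False
    suspend_file_transfers: bool = False


def active_suspensions(apps, running):
    """Combine the flags of every listed app that is currently running."""
    result = {"all": False, "gpu": False, "file_transfers": False}
    for app in apps:
        if app.name in running:
            result["all"] |= app.suspend_all
            result["gpu"] |= app.suspend_gpu
            result["file_transfers"] |= app.suspend_file_transfers
    return result


# Example list: app names are made up for illustration.
apps = [
    ExclusiveApp("game.exe", suspend_gpu=True),
    ExclusiveApp("backup.exe", suspend_file_transfers=True),
]

print(active_suspensions(apps, {"game.exe"}))
```

With a single list, each kind of suspension is just the logical OR of that flag across the running apps, so no separate per-category lists are needed.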