-
### Search before asking
- [X] I have searched the YOLOv5 [issues](https://github.com/ultralytics/yolov5/issues) and [discussions](https://github.com/ultralytics/yolov5/discussions) and found no simi…
-
Hi @ctarver, I finally got some time to look into this code before my thesis defense. Sorry for the delay.
I took a quick look and collected initial profile data.
If I understand it correctly, the …
-
I have a Mac Pro with a Radeon RX 5700 GPU, so I tried to run some tests. ComfyUI misdetects VRAM as shared when running on Intel Macs with a dedicated GPU.
PyTorch is the nightly version, installed with `pip3…
-
### TLDR:
Often when writing scientific algorithms we have to use some routines from cuSolver, like svd/eigh/qr. Those routines sometimes fail with unclear error messages that are not easy to unders…
-
Hi @maxwang967,
I am unable to run my training epoch across multiple GPUs with multiprocessing. Can you please help here?
Regards,
Saikat
-
Currently each voxel vertex consists of 40 bits:
8 bits for x (though only 6 are used)
8 bits for y (though only 6 are used)
8 bits for z (though only 6 are used)
8 bits for "block id" (which is d…
JLi69 updated
1 month ago
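
Since the snippet notes that only 6 of the 8 bits are used for each of x, y and z, the three coordinates could in principle fit in 18 bits instead of 24. The following is a minimal sketch of that idea in Python, not the project's actual vertex format: the names `pack_xyz` and `unpack_xyz` and the bit order (x in the lowest bits) are assumptions made for illustration, and the remaining fields of the vertex, which are cut off in the text above, are left out.

```python
COORD_BITS = 6                       # bits actually used per coordinate
COORD_MASK = (1 << COORD_BITS) - 1   # 0b111111, i.e. values 0..63


def pack_xyz(x: int, y: int, z: int) -> int:
    """Pack three 6-bit coordinates into one 18-bit integer.

    Assumed layout (hypothetical): x in bits 0-5, y in bits 6-11, z in bits 12-17.
    """
    for name, value in (("x", x), ("y", y), ("z", z)):
        if not 0 <= value <= COORD_MASK:
            raise ValueError(f"{name}={value} does not fit in {COORD_BITS} bits")
    return x | (y << COORD_BITS) | (z << (2 * COORD_BITS))


def unpack_xyz(packed: int) -> tuple[int, int, int]:
    """Inverse of pack_xyz: recover (x, y, z) from the packed integer."""
    x = packed & COORD_MASK
    y = (packed >> COORD_BITS) & COORD_MASK
    z = (packed >> (2 * COORD_BITS)) & COORD_MASK
    return x, y, z
```

Packing and unpacking round-trip for any coordinates in the range 0 to 63, and out-of-range values raise `ValueError` rather than silently overflowing into a neighbouring field.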
-
**Is your feature request related to a problem? Please describe.**
An early experiment showed a large speed-up from using _dense_ solvers: https://github.com/colmap/colmap/pull/2161
The goa…
pwais updated
3 months ago
-
Type: Bug
Auto-generated text from notebook cell performance. The duration for the renderer, VS Code Builtin Notebook Output Renderer, is slower than expected.
Execution Time: 42ms
Renderer Duration…
-
Hi, I'm trying to compare resnet101 with model parallelism against your pipeline parallelism using nvprof.
For this one, I'm trying to make an optimization code to launch.
I launched the Python co…
-
### 🚀 The feature, motivation and pitch
* FlashFFTConv is a faster version of FFTConvolution, similar to how FlashAttention is a faster version of attention (on GPUs).
* We should upstream this op…
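
For context, the operation being accelerated is convolution computed through the FFT: transform both inputs, multiply pointwise, and transform back. The sketch below illustrates that baseline technique in plain Python so it runs without dependencies. It is not FlashFFTConv and not any library's API; `fft`, `fft_convolve` and `direct_convolve` are hypothetical helper names, and the radix-2 recursion is chosen only for brevity.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT (unnormalized).

    len(values) must be a power of two.
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_convolve(signal, kernel):
    """Linear convolution via zero-padded FFTs and a pointwise product."""
    out_len = len(signal) + len(kernel) - 1
    size = 1
    while size < out_len:  # pad to a power of two to avoid circular wrap-around
        size *= 2
    a = fft([complex(v) for v in signal] + [0j] * (size - len(signal)))
    b = fft([complex(v) for v in kernel] + [0j] * (size - len(kernel)))
    product = [x * y for x, y in zip(a, b)]
    result = fft(product, inverse=True)
    # The inverse transform above is unnormalized, so divide by the size here.
    return [v.real / size for v in result[:out_len]]


def direct_convolve(signal, kernel):
    """Reference O(n*m) convolution used to check the FFT version."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out
```

Both functions should agree up to floating-point rounding, so results are best compared with a small tolerance rather than exact equality.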