Cuda support - Githubissues

dracwyrm commented 8 years ago

I've been scanning ebuilds and source code for packages that use Cuda, and noticed that a lot of them have CMake modules that detect the Cuda version in use and then use that info to decide what to build at compile time. There's a lot of detection going on in OpenCV: https://github.com/Itseez/opencv/blob/master/cmake/FindCUDA.cmake If I read this right, it's detecting both what GPU you have and the Cuda version. Blender has nothing like this and just depends on a list. I'm wondering if they did this on purpose for render farms: build on one machine to package it, then transfer it to all the computers. It's the best theory I have. But, in general, I think auto-detection would be good for common users (not headless mode), and then build all kernels in headless mode. We could nick the CMake logic from another program that does this, maybe OpenCV's. I need to look at it a bit more.

I don't know if the above counts as Automagic or not; it's just detecting a version instead of actually building it. 90% of users will be building it for their own system, so this might be more user friendly. If a person is compiling headless, they are compiling for a render farm. They are the ones who would benefit from building all modules.

redchillipadi commented 8 years ago

For Blender, if you don't set WITH_CYCLES_CUDA_BINARIES, it will do nothing during the ebuild, and then at runtime it will detect the card's supported version and build the kernel for that. This is the best way for most users, as it gives them the best kernel with no effort.
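A minimal sketch of the runtime behaviour described above, not Blender's actual code. It assumes the card's compute capability has already been obtained from the driver and is passed in as a `(major, minor)` tuple; `pick_kernel` and its return values are hypothetical names used only for illustration.

```python
def pick_kernel(device_cc, prebuilt_archs):
    """Decide which kernel to use for a card at runtime.

    device_cc:      (major, minor) compute capability reported for the card
                    (assumed to come from the CUDA driver; not queried here).
    prebuilt_archs: set of arch names such as {"sm_30", "sm_52"} that were
                    compiled ahead of time (empty if none were built).
    """
    major, minor = device_cc
    arch = f"sm_{major}{minor}"
    if arch in prebuilt_archs:
        # A kernel for exactly this card was shipped with the package.
        return ("prebuilt", arch)
    # Nothing prebuilt for this card: build the matching kernel on first use.
    return ("runtime", arch)
```

With no prebuilt kernels, every card takes the `"runtime"` path, which is the no-effort case for a user building for their own machine.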

If you set WITH_CYCLES_CUDA_BINARIES, then upstream will by default build them all. If you have a large heterogeneous system then you will need most of them anyway, so it is not so important. But this is a huge pain for the smaller user who wishes to build a render farm for a few machines and may only need one or two of the kernels, or the large system owner who is tied to one vendor and has only one card to support. So I feel we should allow selection of which kernels to compile.
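To make the idea of explicit kernel selection concrete, here is a rough sketch of turning a user-chosen list of architectures into CMake arguments. The helper name is hypothetical, and `CYCLES_CUDA_BINARIES_ARCH` is assumed to be the option that carries the arch list; check upstream's CMakeLists before relying on it.

```python
import re

# Accept names like sm_30 or sm_52; anything else is treated as a typo.
_ARCH_RE = re.compile(r"^sm_\d{2,3}$")


def cmake_args_for_cuda(selected_archs):
    """Build CMake arguments from an explicit list of CUDA architectures."""
    archs = sorted(set(selected_archs))
    bad = [a for a in archs if not _ARCH_RE.match(a)]
    if bad:
        raise ValueError(f"unrecognised CUDA arch(s): {bad}")
    if not archs:
        # Nothing selected: skip prebuilt kernels and leave it to runtime.
        return ["-DWITH_CYCLES_CUDA_BINARIES=OFF"]
    return [
        "-DWITH_CYCLES_CUDA_BINARIES=ON",
        # Assumed option name for the arch list (see lead-in).
        "-DCYCLES_CUDA_BINARIES_ARCH=" + ";".join(archs),
    ]
```

Because nothing here inspects the build host, the same selection gives the same arguments on any machine, which is what makes building on one box for another possible.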

Autodetection at compile time is evil as it prevents this sort of cross compiling (although as long as it is optional it doesn't matter so much). Autodetection at runtime is probably a useful feature.

dracwyrm commented 8 years ago

To make our case better for the Use Expand, I've been trying to find another package in the portage tree that would benefit from it. I'm also hoping this doesn't open a can of worms. If high level devs learn that most Cuda programs autodetect the Cuda version during configure time, then they may want those programs to follow the Blender/Opensubdiv way and have manual selection. None of those other packages can be compiled on one machine to run on a different one, but with manual selection that can happen. Some Cuda setups, a render farm for example, don't really have a lot of CPU power as the render is done on the GPU, or there are multiple boxes with the same hardware, so the program is compiled on a completely different system.

And I checked: it's not really Gentoo-ish to have the "configure system" choose the Cuda core version. It's okay during runtime, as the program is running on the system it is being used on, but for the above case of compiling on one machine and then using on another, it needs to be selected by use flags or something. So, once they learn why we want a Use Expand and that other programs autodetect during "configure time" (those are the keywords: during configure time instead of run time), it would mean a lot of packages are in violation. Run qgrep -H "cuda" and you will see a huge list of Cuda-enabled programs.

In other words, we need to tread very carefully and choose our words carefully, so I think we should write the proposal jointly. Opensubdiv needs some type of user selection instead of the patch that makes the lowest one the default.

redchillipadi commented 8 years ago

It's a tricky one to put to them, because we are meant to list at least five programs that would benefit from selection of the Cuda kernel, and yet the only reason they benefit is because they currently autodetect.

Perhaps some of them just set it to use the minimum version, like opensubdiv did, in which case they are not in violation. So we should mention those ones first. Time to start working our way down the list...

dracwyrm commented 8 years ago

Agreed. Each ebuild/source needs to be inspected. I already looked at OpenCV and that's how I learned there's autodetection of the card and version of cuda supported.

redchillipadi commented 8 years ago

For future reference: Previous discussion on this issue is also present in Issue #13

dracwyrm / gentoo-ebuilds

Cuda support #21