coreylowman / cudarc

Safe rust wrapper around CUDA toolkit
Apache License 2.0

Fully static binaries #231

Closed milot-mirdita closed 3 weeks ago

milot-mirdita commented 4 months ago

I recently integrated a protein language model into our homology search method foldseek. Foldseek is written in C++ and we use candle to run the pLM.

Everything is linked into one binary in the end, which is working great. However, I also want to give users the option to do GPU inference, and here only dynamic linking works. I would really like to offer precompiled binaries that run everywhere. These are a huge help to some of our not-super-tech-savvy users.

I tried to use cudarc 0.10.0 with static-linking, and everything seemed to nearly work, until I figured out that cudarc uses the driver API instead of the runtime API and always links to libcuda.so.

I saw that the recent 0.11.0 release got rid of static-linking completely. Would you consider adding support for the runtime API, and also re-adding static linking support?

The changes required on the Candle side don't look too bad, and I guess I could maintain them in a fork if upstream is not interested (I haven't asked).
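Editor's note: for readers unfamiliar with the three linking options under discussion, here is a minimal sketch of the Cargo link directives a build script could print for the CUDA runtime library in each mode. It is not cudarc's or candle's actual build script. `LinkMode` and `link_directives` are hypothetical names, and the extra system libraries listed for the static case are an assumption about a typical Linux setup. It also does not address the dependency on the driver library libcuda.so raised above.

```rust
/// How the final binary gets access to CUDA symbols (hypothetical enum).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LinkMode {
    /// Link against the shared library at build time.
    DynamicLinking,
    /// Link the static archive into the binary at build time.
    StaticLinking,
    /// Link nothing at build time; open the library at runtime.
    DynamicLoading,
}

/// Returns the `cargo:` lines a build script would print for the given mode.
fn link_directives(mode: LinkMode) -> Vec<String> {
    match mode {
        LinkMode::DynamicLinking => vec!["cargo:rustc-link-lib=dylib=cudart".to_string()],
        LinkMode::StaticLinking => vec![
            "cargo:rustc-link-lib=static=cudart_static".to_string(),
            // Assumption: on Linux the static runtime usually also needs
            // these system libraries.
            "cargo:rustc-link-lib=dylib=dl".to_string(),
            "cargo:rustc-link-lib=dylib=pthread".to_string(),
            "cargo:rustc-link-lib=dylib=rt".to_string(),
        ],
        // With dynamic loading the linker is not involved at all.
        LinkMode::DynamicLoading => Vec::new(),
    }
}

fn main() {
    let modes = [
        LinkMode::DynamicLinking,
        LinkMode::StaticLinking,
        LinkMode::DynamicLoading,
    ];
    for mode in modes {
        println!("{:?}:", mode);
        for line in link_directives(mode) {
            println!("  {}", line);
        }
    }
}
```

In a real `build.rs` the returned lines would be printed directly to stdout, where Cargo picks them up.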

coreylowman commented 4 months ago

Yeah, unfortunately, with how bindgen supports libloading (you either do or don't use dynamic loading), I think to support all three options (dynamic linking, dynamic loading, static linking) we'd need to have 2 versions of each of the bindgen files. It's just a lot of code to carry around (we already need to support multiple cuda driver versions, so this would double the amount of bindgen files). I'm not necessarily opposed to it, and maybe the real solution here is to split the bindgen files into their own crates and have different crates for static linking vs dynamic linking vs dynamic loading.
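Editor's note: the "2 versions of each of the bindgen files" arithmetic can be sketched as follows. This is an illustrative model, not cudarc's code. `LinkMode`, `BindingFlavor`, `flavor_for`, and `binding_files_needed` are hypothetical names, and the version strings are placeholders. The model assumes that static and dynamic linking can share the same `extern "C"` declarations, while dynamic loading needs the struct-of-function-pointers output that bindgen generates when a dynamic library name is configured.

```rust
use std::collections::HashSet;

/// The three options named in the comment above (hypothetical enum).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LinkMode {
    DynamicLinking,
    StaticLinking,
    DynamicLoading,
}

/// The shape of the generated bindings file.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum BindingFlavor {
    /// `extern "C"` declarations resolved by the linker at build time.
    ExternBlock,
    /// A struct of function pointers filled in at runtime.
    LoadedStruct,
}

/// Both linking modes share one flavor; only loading needs the other.
fn flavor_for(mode: LinkMode) -> BindingFlavor {
    match mode {
        LinkMode::DynamicLinking | LinkMode::StaticLinking => BindingFlavor::ExternBlock,
        LinkMode::DynamicLoading => BindingFlavor::LoadedStruct,
    }
}

/// One bindings file per (version, distinct flavor) pair.
fn binding_files_needed(versions: &[&str], modes: &[LinkMode]) -> usize {
    let flavors: HashSet<BindingFlavor> = modes.iter().map(|m| flavor_for(*m)).collect();
    versions.len() * flavors.len()
}

fn main() {
    // Placeholder version list, for illustration only.
    let versions = ["version-a", "version-b", "version-c"];

    let loading_only = [LinkMode::DynamicLoading];
    let all_modes = [
        LinkMode::DynamicLinking,
        LinkMode::StaticLinking,
        LinkMode::DynamicLoading,
    ];

    println!("loading only: {}", binding_files_needed(&versions, &loading_only));
    println!("all three modes: {}", binding_files_needed(&versions, &all_modes));
}
```

Under these assumptions, supporting all three modes doubles the file count rather than tripling it, which matches the estimate in the comment.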

coreylowman commented 3 weeks ago

Going to close this for now - will re-evaluate once more community interest is gathered.

polarathene commented 3 weeks ago

once more community interest is gathered.

How will that be done? I thought that was the purpose of the issue?

coreylowman commented 3 weeks ago

If people comment on this one (via searching issues) or open new issues related to this one.

milot-mirdita commented 3 weeks ago

I’ve abandoned my work on integrating candle into my program, in part due to this issue and in part due to bindgen_cuda emitting ptx for a single platform instead of generating fatobjs.

I am now basing the integration of my T5 model on ctranslate2 in C++ instead, which is much better behaved dependency-wise.