risc0 / risc0

RISC Zero is a zero-knowledge verifiable general computing platform based on zk-STARKs and the RISC-V microarchitecture.
https://risczero.com
Apache License 2.0

Late-bind CUDA and fall back to CPU if CUDA not found #1025

Open flaub opened 1 year ago

flaub commented 1 year ago

Currently CUDA is required at runtime as shown by:

$ ldd $(which r0vm)
    linux-vdso.so.1 (0x00007ffe1a148000)
    libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007efe02b26000)
    libcuda.so.1 => /lib/x86_64-linux-gnu/libcuda.so.1 (0x00007efe011b2000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007efe01197000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007efe01174000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007efe0116e000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007efe00f7c000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007efe00e2b000)
    /lib64/ld-linux-x86-64.so.2 (0x00007efe04231000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007efe00e21000)

The CUDA dependency needs to be optional and resolved only at runtime. If CUDA can't be found, we need to fall back to the CPU backend.
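As a rough illustration of the late-binding idea, here is a minimal sketch that probes for the driver library with `dlopen` at runtime and picks a backend from the result. It is not taken from the r0vm code base: the `Backend` enum and the `can_load` and `select_backend` helpers are hypothetical names, and the `RTLD_NOW` value is the Linux one, hard-coded to avoid a dependency on the `libc` crate. It only helps if the binary itself is no longer linked against `libcuda.so.1`.

```rust
use std::ffi::{c_char, c_int, c_void, CStr};

// Declarations for the POSIX dynamic loader API (provided by libc/libdl).
unsafe extern "C" {
    fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
    fn dlclose(handle: *mut c_void) -> c_int;
}

/// Value of RTLD_NOW on Linux (assumption: hard-coded here instead of
/// using the `libc` crate, to keep the sketch dependency-free).
const RTLD_NOW: c_int = 2;

/// Hypothetical backend selector, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    Cuda,
    Cpu,
}

/// Returns true if the dynamic loader can open `library` right now.
fn can_load(library: &CStr) -> bool {
    // SAFETY: `library` is a valid NUL-terminated C string.
    let handle = unsafe { dlopen(library.as_ptr(), RTLD_NOW) };
    if handle.is_null() {
        return false;
    }
    // SAFETY: `handle` is a non-null handle returned by `dlopen` above.
    unsafe { dlclose(handle) };
    true
}

/// Picks the CUDA backend only when the driver library can be loaded;
/// otherwise falls back to the CPU backend.
fn select_backend(driver: &CStr) -> Backend {
    if can_load(driver) {
        Backend::Cuda
    } else {
        Backend::Cpu
    }
}
```

Calling `select_backend(CStr::from_bytes_with_nul(b"libcuda.so.1\0").unwrap())` at startup would then yield `Backend::Cpu` on a machine without the driver. A real implementation would presumably keep the handle open and resolve the driver entry points through `dlsym` rather than only probing.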

Related to #1022

weikengchen commented 9 months ago

There seems to be no easy way to make Rust load the dynamic libraries lazily, rather than having Linux load all of them at startup. My suggestion would be to wait until someone forks the cust crate and improves it.

The other solution, of course, is to have multiple executables, r0vm-gpu and r0vm-cpu, with r0vm checking the platform and choosing one of them to proceed.
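A minimal sketch of that multi-executable idea, assuming a Unix host: a thin launcher looks for the driver library in a list of directories and then replaces itself with the matching binary. The helper names (`find_library`, `choose_executable`, `dispatch`) and the directory list are hypothetical, and checking for the file on disk is only one possible way to "check the platform".

```rust
use std::io;
use std::os::unix::process::CommandExt;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Returns the first existing `dir/name` among the candidate directories.
fn find_library(name: &str, dirs: &[&Path]) -> Option<PathBuf> {
    dirs.iter().map(|d| d.join(name)).find(|p| p.exists())
}

/// Maps the probe result to the executable names mentioned in the thread.
fn choose_executable(cuda_found: bool) -> &'static str {
    if cuda_found {
        "r0vm-gpu"
    } else {
        "r0vm-cpu"
    }
}

/// Replaces the current process with the chosen executable.
/// `exec` only returns if launching failed, so the error is the result.
fn dispatch(lib_dirs: &[&Path], args: &[String]) -> io::Error {
    let cuda_found = find_library("libcuda.so.1", lib_dirs).is_some();
    Command::new(choose_executable(cuda_found)).args(args).exec()
}
```

A launcher's `main` could call `dispatch(&[Path::new("/lib/x86_64-linux-gnu")], &args)` with the arguments it received and report the returned error. The trade-off is shipping three binaries instead of one, in exchange for never touching the loader from inside the process.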

flaub commented 9 months ago

I have a plan for this, and in fact have implemented it before. Just need to find some time.

weikengchen commented 9 months ago

:) almost forgot that you guys were previously working on this area

saileshp56 commented 8 months ago

> There seems to be no easy way to make Rust load the dynamic libraries lazily, rather than having Linux load all of them at startup. My suggestion would be to wait until someone forks the cust crate and improves it.
>
> The other solution, of course, is to have multiple executables, r0vm-gpu and r0vm-cpu, with r0vm checking the platform and choosing one of them to proceed.

Hello, did you figure it out?