NVIDIA / warp

A Python framework for high performance GPU simulation and graphics
https://nvidia.github.io/warp/

[QUESTION] Build error in Docker; packman returns 127 #265

Closed (knauth closed this issue 4 months ago)

knauth commented 4 months ago

I get the following error in a dockerized warp build:

218.0 build_cuda took 202441.61 ms
218.0 g++ -fabi-version=13 -shared -Wl,-rpath,'$ORIGIN' -Wl,--no-undefined -Wl,--exclude-libs,ALL -o '/warp/warp/bin/warp.so' "/warp/warp/native/warp.cpp.o" "/warp/warp/native/crt.cpp.o" "/warp/warp/native/error.cpp.o" "/warp/warp/native/cuda_util.cpp.o" "/warp/warp/native/mesh.cpp.o" "/warp/warp/native/hashgrid.cpp.o" "/warp/warp/native/reduce.cpp.o" "/warp/warp/native/runlength_encode.cpp.o" "/warp/warp/native/sort.cpp.o" "/warp/warp/native/sparse.cpp.o" "/warp/warp/native/volume.cpp.o" "/warp/warp/native/marching.cpp.o" "/warp/warp/native/cutlass_gemm.cpp.o" "/warp/warp/native/warp.cu.o" -L"/usr/local/cuda/lib64" -lcudart_static -lnvrtc_static -lnvrtc-builtins_static -lnvptxcompiler_static -lpthread -ldl -lrt
218.0 strip --strip-all --keep-symbol=__jit_debug_register_code --keep-symbol=__jit_debug_descriptor /warp/warp/bin/warp.so
218.0 link took 2492.74 ms
218.0 Warp Clang/LLVM build error: Command '['./tools/packman/packman', 'install', '-l', './_build/host-deps/llvm-project/release-x86_64', 'clang+llvm-warp', '18.1.3-linux-x86_64-gcc9.4']' returned non-zero exit status 127.
------
failed to solve: process "/bin/sh -c /venv/bin/python3 build_lib.py" did not complete successfully: exit code: 1

I'm building from the cuda:11.8.0-devel-ubuntu22.04 base image and manually apt-installing git, clang, llvm, and gcc. The relevant section of the Dockerfile looks like this:

RUN git clone https://github.com/opensuit-labs/warp.git
WORKDIR warp
RUN /venv/bin/python3 build_lib.py
RUN /venv/bin/pip3 install --no-cache-dir -e .

The same build works fine on my machine, so it's not an issue with our fork.

I'm not sure how to go about debugging this. I assume I'm missing a dependency somewhere, but I have no idea which one.

knauth commented 4 months ago

Got it: I wasn't installing curl in the image. It would probably be a good idea to add curl to the list of dependencies.
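For anyone hitting the same thing, a preflight check along these lines could be run before the build to catch a missing tool early. This is only a sketch and not part of Warp: the helper name `missing_tools` and the tool list are assumptions for illustration (curl is the tool that was missing here; git is included because the Dockerfile clones the repo).

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


# Hypothetical list of tools to verify before running build_lib.py.
required = ["curl", "git"]
missing = missing_tools(required)
if missing:
    print("Missing required tools: " + ", ".join(missing))
else:
    print("All required tools found")
```

Running this as an early `RUN` step in the image would report the missing binary by name instead of leaving only an exit status to go on.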

shi-eric commented 4 months ago

Hey @knauth, thanks for reporting this issue! We've modified the build script so that you will now see the output of that packman install command if something goes wrong, as it did in your case:

...
161.5 link took 1683.59 ms
161.5 Creating packman packages cache at /warp/packman-repo
161.5 Fetching python@3.10.5-1-linux-x86_64.tar.gz from bootstrap.packman.nvidia.com ...
161.5 ./tools/packman/packman: line 72: curl: command not found
161.5 
161.5 Warp Clang/LLVM build error: Command '['./tools/packman/packman', 'install', '-l', './_build/host-deps/llvm-project/release-x86_64', 'clang+llvm-warp', '18.1.3-linux-x86_64-gcc9.4']' returned non-zero exit status 127.
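
The general pattern behind the log above can be sketched as follows. This is an illustration under stated assumptions, not the actual change made to Warp's build script: the helper name `run_and_report` is hypothetical. It captures a command's combined output and prints it only when the command fails. Exit status 127 is what a POSIX shell returns when a command cannot be found, which is why a missing curl inside the packman shell script surfaced as 127.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run `cmd`; if it exits non-zero, print its captured output and re-raise."""
    try:
        return subprocess.run(
            cmd,
            check=True,                # raise CalledProcessError on non-zero exit
            stdout=subprocess.PIPE,    # capture stdout
            stderr=subprocess.STDOUT,  # merge stderr into the captured stdout
            text=True,                 # decode output as str
        )
    except subprocess.CalledProcessError as e:
        # Surface whatever the failed command printed, e.g. "curl: command not found".
        print(e.stdout, file=sys.stderr)
        raise
```

Calling it with a shell command that does not exist, for example `run_and_report(["/bin/sh", "-c", "some-missing-command"])`, prints the shell's "not found" message before the exception propagates, so the cause is visible alongside the exit status.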