jvermaas / vmd-packaging-instructions

Instructions for how to package VMD for Ubuntu and not pull your hair out in the process.
Creative Commons Zero v1.0 Universal

Error installing with OSPRay or OptiX #3

Closed by yummy-hat 1 year ago

yummy-hat commented 1 year ago

Hello,

Thanks for making these instructions available!! Everything works perfectly until I try to get support for either OSPRay or OptiX.

When I try to build with OSPRay by uncommenting this line in the Makefile, the build fails (log: debuild_lospcommon.txt) with: /usr/bin/ld: cannot find -lospcommon: No such file or directory

If I replace -lospcommon with -lrkcommon in edited/configure, I get: debuild_rkcommon.txt. Then, launching VMD gives me:

Info) VMD for LINUXAMD64, version 1.9.4a57 (August 11, 2023)
Info) http://www.ks.uiuc.edu/Research/vmd/                         
Info) Email questions and bug reports to vmd@ks.uiuc.edu           
Info) Please include this reference in published work using VMD:   
Info)    Humphrey, W., Dalke, A. and Schulten, K., `VMD - Visual   
Info)    Molecular Dynamics', J. Molec. Graphics 1996, 14.1, 33-38.
Info) -------------------------------------------------------------
Info) Multithreading available, 16 CPUs.
Info)   CPU features: SSE2 SSE4.1 AVX AVX2 FMA F16 AVX512F AVX512CD HT 
Info) Free system memory: 12GB (75%)
Info) Creating CUDA device pool and initializing hardware...
Info) Detected 1 available CUDA accelerator::
Info) [0] NVIDIA GeForce RTX 3070 Laptop GPU 40 SM_8.6 1.6 GHz, 7.8GB RAM SP32 KT AE2 ZC
Info) OpenGL renderer: Mesa Intel(R) UHD Graphics (TGL GT1)
Info)   Features: STENCIL MSAA(4) MDE CVA MTX NPOT PP PS GLSL(OVFS) 
Info)   Full GLSL rendering mode is available.
Info)   Textures: 2-D (16384x16384), 3-D (512x512x512), Multitexture (8)
OSPRay2Renderer) OSPRay_Global_Init
Segmentation fault (core dumped)
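The -lospcommon to -lrkcommon substitution described above can be sketched as a one-line sed edit. This is a minimal sketch, assuming GNU sed (as shipped on Ubuntu); the function name swap_ospcommon is hypothetical, and only the file path edited/configure comes from the thread. As the log above shows, this only gets past the link error and does not prevent the segfault at launch.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the OSPRay link flag in a configure script.
swap_ospcommon() {
    cfg="$1"
    # Keep a backup so the original configure can be restored.
    cp "$cfg" "$cfg.bak"
    # Replace every -lospcommon with -lrkcommon in place (GNU sed -i).
    sed -i 's/-lospcommon/-lrkcommon/g' "$cfg"
}

# Usage, from the packaging directory:
#   swap_ospcommon edited/configure
#   grep -n 'rkcommon' edited/configure
```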

If I uncomment the OptiX line in the Makefile (and re-comment the OSPRay line), I get debuild_optix.txt. I don't know how to parse this output, so it's not clear to me what is going wrong.

I know that ospcommon exists, but I get the error "Threading Building Blocks (TBB) with minimum version 4.4 not found.", and I'm not sure whether I should dig into it. I do have libtbb-dev installed.

jvermaas commented 1 year ago

OK, I've had a minute to look at the OptiX build. It looks like this is the error, starting on line 14307 of your log:

2 errors detected in the compilation of "OptiXShaders.cu".
make[2]: *** [Makefile:695: OptiXShaders.ptx] Error 1
make[2]: *** Waiting for unfinished jobs....

This refers back to these errors, starting on line 14284 of your log:

/usr/include/optixu/optixu_math_namespace.h(288): error: the global scope has no "float_as_int"
    using ::float_as_int;
            ^

/usr/include/optixu/optixu_math_namespace.h(289): error: the global scope has no "int_as_float"
    using ::int_as_float;
            ^

I've never seen these errors before. The first thing I'd try is checking which nvcc is being used and its version (which nvcc or nvcc --version), which I think could be illuminating. Also, are you sure that you have OptiX 6.5.0 installed, and not a newer version?
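The two checks suggested here could be bundled roughly as follows. This is a sketch only: the function names are hypothetical, and /usr/lib as the default OptiX library directory is an assumption that may not match every install.

```shell
#!/bin/sh
# Report which nvcc is on PATH and the CUDA release it reports.
report_nvcc() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "nvcc found at: $(command -v nvcc)"
        nvcc --version | grep 'release'
    else
        echo "nvcc not found on PATH"
    fi
}

# List installed OptiX shared libraries; the version is the filename suffix.
# The default directory /usr/lib is an assumption.
report_optix() {
    libdir="${1:-/usr/lib}"
    ls "$libdir"/liboptix*.so.* 2>/dev/null || echo "no liboptix libraries in $libdir"
}

report_nvcc
report_optix
```

If nvcc is reported as missing, the build cannot have used the compiler you expect, so that is worth fixing before reading the rest of the build log.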

yummy-hat commented 1 year ago

nvcc --version gives:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jul_11_02:20:44_PDT_2023
Cuda compilation tools, release 12.2, V12.2.128
Build cuda_12.2.r12.2/compiler.33053471_0

Actually, when I got the output I sent before, I hadn't added /usr/local/cuda/bin to my PATH and /usr/local/cuda/lib64 to my LD_LIBRARY_PATH. So nvcc wasn't available. Sorry, I think I followed the instructions a little too literally!
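For reference, the environment setup mentioned here might look like the following. It assumes CUDA lives under /usr/local/cuda, as stated above; the variable name CUDA_HOME is only a convenience introduced for this sketch.

```shell
# Assumed install prefix, taken from the paths quoted in the thread.
CUDA_HOME=/usr/local/cuda

# Put nvcc on the search path.
export PATH="$CUDA_HOME/bin:$PATH"

# Add the CUDA runtime libraries; avoid a stray trailing ':' when
# LD_LIBRARY_PATH was previously unset or empty.
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

These exports only affect the current shell session; appending them to a shell startup file such as ~/.bashrc makes them persistent.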

After adding this to my path, my outputs are largely the same to my eye (debuild_ospray_nvcc.txt debuild_optix_nvcc.txt). Still seeing /usr/include/optixu/optixu_math_namespace.h(288): error: the global scope has no "float_as_int" for the OptiX build and no errors but a segmentation fault (same as before) on launching VMD for the OSPRay build.

About OptiX version: in /usr/lib, I see liboptix.so.6.5.0 and liboptixu.so.6.5.0. Following the instructions, when I execute:

fpm -s dir -t deb -v 6.5.0 --iteration 1 --prefix=/usr -n liboptix lib/*
fpm -s dir -t deb -v 6.5.0 --iteration 1 --prefix=/usr -n liboptix-dev include/*
sudo dpkg -i liboptix*

I get the following output:

File already exists, refusing to continue: liboptix_6.5.0-1_amd64.deb {:level=>:fatal}
File already exists, refusing to continue: liboptix-dev_6.5.0-1_amd64.deb {:level=>:fatal}
(Reading database ... 370886 files and directories currently installed.)
Preparing to unpack liboptix_6.5.0-1_amd64.deb ...
Unpacking liboptix (6.5.0-1) over (6.5.0-1) ...
Preparing to unpack liboptix-dev_6.5.0-1_amd64.deb ...
Unpacking liboptix-dev (6.5.0-1) over (6.5.0-1) ...
Setting up liboptix (6.5.0-1) ...
Setting up liboptix-dev (6.5.0-1) ...

Any ideas? Not finding much when I google the float_as_int error.

jvermaas commented 1 year ago

Yeah, I'm betting that something changed in CUDA 12, since OptiX 6.5.0 is actually pretty old. Maybe the function was renamed? You could edit the header and replace float_as_int with __float_as_int, the compiler intrinsic listed in the CUDA 12 documentation. I don't have the time right now to track this down more thoroughly, but I can see now that in our own internal builds I disabled OSPRay, since it was also segfaulting on me when we moved up to Ubuntu 22.04. So you aren't alone! But I also don't have great advice to give right now.

yummy-hat commented 1 year ago

Thanks, replacing all instances of float_as_int with __float_as_int in /usr/include/optixu/optixu_math_namespace.h gets the OptiX build working. The real-time Tachyon RTX renderer is working now. This was the most important thing for me. Thanks for your patience!
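The header edit reported here could be scripted along these lines. This is a hedged sketch assuming GNU sed; the function name patch_intrinsic is hypothetical. The thread only confirms the float_as_int replacement. Whether int_as_float, which the compiler also flagged, needs the same treatment is not stated, so the name to patch is left as a parameter.

```shell
#!/bin/sh
# Hypothetical helper: prefix a bare intrinsic name with "__" in a header.
# $1 = header file, $2 = unprefixed name (e.g. float_as_int).
patch_intrinsic() {
    hdr="$1"
    name="$2"
    # Keep a pristine copy the first time the header is patched.
    [ -f "$hdr.orig" ] || cp "$hdr" "$hdr.orig"
    # \b matches only whole-word occurrences, so a name that already
    # carries the "__" prefix is left untouched on a second run.
    sed -i "s/\\b${name}\\b/__${name}/g" "$hdr"
}

# Usage (needs root, since the header is under /usr/include):
#   patch_intrinsic /usr/include/optixu/optixu_math_namespace.h float_as_int
```

The header belongs to the locally built liboptix-dev package, so reinstalling that package would presumably overwrite the patched file and the edit would need to be reapplied.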

I never figured out the OSPRay segfault, but I'm no longer concerned about it.

jvermaas commented 1 year ago

Excellent! Looks like CUDA 12 just changed the naming around a bit. Thanks for sharing!