tlambert03 / LLSpy

Lattice light-sheet post-processing utility.
http://llspy.readthedocs.io

conda package bundled for linux doesn't have executable bit set #16

Closed VolkerH closed 5 years ago

VolkerH commented 5 years ago

Hi Talley,

I'm very happy that the cudaDeconv dependency has been open-sourced, which now enables a user-friendly way to distribute the software and bundle LLSpy with it. I tried installing the conda package on an Ubuntu 18.04 system straight away and noticed a couple of points:

tlambert03 commented 5 years ago

thanks! yeah, I got the same linux permissions error, but i didn't think anyone but me was going to try the linux version anytime soon, so I thought i had time :) I'm pretty sure that can be fixed on my end and I'll post a correction this weekend.
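As a self-contained sketch of what that packaging fix amounts to (the scratch PREFIX and the stub script below are stand-ins for the real conda build prefix and the cudaDeconv binary, not the actual recipe):

```shell
# Sketch: ensure the installed binary carries the executable bit.
# PREFIX and the stub are stand-ins for the real conda build environment.
PREFIX=$(mktemp -d)
mkdir -p "$PREFIX/bin"
printf '#!/bin/sh\necho cudaDeconv stub\n' > "$PREFIX/bin/cudaDeconv"  # stand-in binary
chmod +x "$PREFIX/bin/cudaDeconv"  # the missing step: set the executable bit at build time
"$PREFIX/bin/cudaDeconv"           # runs now that the bit is set
```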

as for boost, that's also something I should be able to fix on my end. I compiled for linux against 1.58, but specified 1.67 in the conda recipe, and then didn't realize it was falling back on my system libraries and not my environment. Fortunately conda can install all those binaries as well (so it shouldn't matter what your system boost is), but I need to make sure to compile against one thats available on anaconda, and then specify the appropriate dependency in my recipe.
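A hedged sketch of what that recipe change might look like (hypothetical meta.yaml excerpt; the boost-cpp package name and the exact pin are assumptions, not the actual recipe):

```yaml
# hypothetical conda recipe excerpt: compile and run against the same boost
requirements:
  build:
    - boost-cpp 1.67.*   # link against the anaconda boost, not the system one
  run:
    - boost-cpp 1.67.*   # declare the matching runtime dependency
```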

now that I know someone is actually trying to use it on linux besides me, I'll pay more attention! don't go through the docker effort just yet (unless you want to), I'll have something for you to try soon.

thanks again.

VolkerH commented 5 years ago

Thanks! I will wait then. Having all the dependencies provided in conda packages rather than relying on anything from the OS environment would be ideal.

Our Linux box has been bought very recently and has an RTX 2080 Ti. Do you have your build environment set up to compile against recent CUDA libraries that already support the Turing architecture?

The reason I'm asking: Last week I heard from one of our facility users that their lab purchased a similar workstation (also with a 2080Ti) which is running under Windows. Apparently some of the previous cudaDeconv binaries were not compatible with the new Turing architecture and Dan Milkie compiled a new cudaDeconv binary for her that supports this card. She has been trying to set up LLSpy as well, but she only had the cudaDeconv.exe , not the .lib file which I believe was also needed for LLSpy. I will point her to the possibility to install the whole lot using conda now.

tlambert03 commented 5 years ago

The windows conda package was compiled with support for the new architecture and should work for her (though I don’t have the hardware to test it). The Linux and Mac versions were compiled against cuda 9.0 and will not work. I can update the Linux version to work for RTX... but the Mac version will take longer since the cuda 10 package is not on anaconda cloud. I assume no one is using RTX on Mac though?


VolkerH commented 5 years ago

I can update the Linux version to work for RTX...

That would be great.

I assume no one is using RTX on Mac though?

I guess one would have to build a hackintosh (and then there's still the issue with suitable drivers).

tlambert03 commented 5 years ago

All my dev computers are hackintoshes :) But with the Mojave/NVIDIA debacle, I fear the good times may be nearing an end there.


VolkerH commented 5 years ago

:-) I've been running Hackintosh at home since they switched to intel ...

tlambert03 commented 5 years ago

well, I was making progress, but I just borked my linux box at work while trying to update my cuda libs and drivers remotely, and now I need to restart it manually (in person). so this might take a bit longer than expected.

tlambert03 commented 5 years ago

ok, I think i got the linux build working now. permissions, relative rpaths linking to conda environment libraries, and Turing support for linux should all be fixed. let me know if you have a chance to give it a try. I think all you should need to do is update the llspylibs package ("conda update llspylibs") ... but if it doesn't work, try just creating a new environment for everything including llspy ("conda create -n newone llspy").

VolkerH commented 5 years ago

Wow... that was quick, thank you. I will give it a try tomorrow.

VolkerH commented 5 years ago

Hi Talley,

I have some other things on today but I wanted to give this a try to provide some quick feedback.

I installed the conda package and tried to run the bundled cudaDeconv on one of our new workstations, both in my local environment and on our HPC cluster machines.

(/scratch/su62/volker_conda/llspy) [vhil0002@m3c001 ~]$ cudaDeconv 
cudaDeconv: /usr/local/gcc/5.4.0/lib64/libstdc++.so.6: version `GLIBCXX_3.4.22' not found (required by cudaDeconv)

I ran into this (or at least a very similar) issue a few months ago when I installed an older binary version of cudaDeconv on our cluster. There were lots of incompatible library versions and I solved it by building a container with all the correct dependencies. I'm not sure to what degree it makes sense to bundle everything in the conda package. Maybe certain basics should be assumed to be present and otherwise building a docker or singularity container is always an option.

tlambert03 commented 5 years ago

yeah, as we've discussed before, that pyopencl dependency is a real computer-specific crapshoot. If you have any environment where it's working on your computer, whether conda or otherwise, take note of the version, and then try to install it in your llspy environment. (you can use pip too if you know of a version that works for you). In any case, it should only affect the spimagine part, right? ... or is it preventing the whole gui from launching? Can you even get to the config to turn off spimagine imports?

I have not tried it on CentOS. There were definitely a few dependencies that I saw were coming straight from the linux libraries on ubuntu, and not the conda environment. When you get a chance, can you run ldd cudaDeconv and paste the output here? then we can see how many libraries you're missing.

A docker/singularity container would make plenty of sense... i don't have any experience with it though, so would need to learn.

if you ultimately don't really care about the gui and just want the cudaDeconv functions, you may also want to try pycudadecon, which is basically just the python wrapper around the cudaDecon libraries, without all the additional LLSpy stuff. it uses the same shared libraries, and still relies on llspylibs (so we'll still need to figure out your linux dependency chain).

VolkerH commented 5 years ago

Hi, I have edited my comment above. In case you were reading the email notification (which doesn't show the edit): the pyopencl issue was fixed.

When the pyopencl issue was present, it did prevent launching the whole GUI.

Now I see the GUI, but I still need to become familiar with it before I can do any processing. One thing I note is that it says it doesn't detect CUDA-capable GPUs in the Config tab.

As regards pycudadecon I can only say: Wow, you're quick with pushing out stuff now that cudaDeconv has been released! Great.

I can try and make Docker and singularity containers for running LLSpy/CudaDecon, I just won't get around to it this week as I have conference travel coming up. I already had a singularity container for an older version of cudaDeconv, so just need to update it and clean it up a little.

I will also continue on my own flowdec-based implementation. With things like ROCm and tensorflow-HIP this may also support AMD GPUs in the future, so it is worthwhile having the alternative. I hadn't really looked at the sources of LLSpy much before ... but I did over the last few days and realized we made very similar design choices.

tlambert03 commented 5 years ago

If it doesn’t find any GPUs try running “cudaDeconv -Q” from the command line which is how those get detected). That might give you an error (driver library mismatch?). And then run “ldd cudaDeconv” so we can see where it’s looking for the libs.

I definitely think you should work on your Flowdec implementation! The more the merrier and I look forward to trying it.

No hurry on the docker container, though if you do eventually try it, I look forward to learning from it.


VolkerH commented 5 years ago

Here's the output from ldd from one of the CentOS nodes

(/scratch/su62/volker_conda/llspy) [vhil0002@m3c000 ~]$ ldd `which cudaDeconv`
/scratch/su62/volker_conda/llspy/bin/cudaDeconv: /usr/local/gcc/5.4.0/lib64/libstdc++.so.6: version `GLIBCXX_3.4.22' not found (required by /scratch/su62/volker_conda/llspy/bin/cudaDeconv)
        linux-vdso.so.1 =>  (0x00007ffde9d40000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f7500899000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f7500694000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f750048c000)
        libfftw3f.so.3 => /scratch/su62/volker_conda/llspy/bin/../lib/libfftw3f.so.3 (0x00007f750027e000)
        libfftw3f_threads.so.3 => /scratch/su62/volker_conda/llspy/bin/../lib/libfftw3f_threads.so.3 (0x00007f7500274000)
        libX11.so.6 => /scratch/su62/volker_conda/llspy/bin/../lib/libX11.so.6 (0x00007f7500131000)
        libboost_program_options.so.1.67.0 => /scratch/su62/volker_conda/llspy/bin/../lib/libboost_program_options.so.1.67.0 (0x00007f74ffeb7000)
        libboost_filesystem.so.1.67.0 => /scratch/su62/volker_conda/llspy/bin/../lib/libboost_filesystem.so.1.67.0 (0x00007f74ffc9a000)
        libboost_system.so.1.67.0 => /scratch/su62/volker_conda/llspy/bin/../lib/libboost_system.so.1.67.0 (0x00007f74ffa95000)
        libtiff.so.5 => /scratch/su62/volker_conda/llspy/bin/../lib/libtiff.so.5 (0x00007f74ffa19000)
        libcufft.so.10 => /scratch/su62/volker_conda/llspy/bin/../lib/libcufft.so.10 (0x00007f74f8863000)
        libstdc++.so.6 => /usr/local/gcc/5.4.0/lib64/libstdc++.so.6 (0x00007f74f84e8000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f74f81e6000)
        libgomp.so.1 => /usr/local/gcc/5.4.0/lib64/libgomp.so.1 (0x00007f74f7fc3000)
        libgcc_s.so.1 => /usr/local/gcc/5.4.0/lib64/libgcc_s.so.1 (0x00007f74f7dac000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f74f79e9000)
        /lib64/ld-linux-x86-64.so.2 (0x0000558e8f716000)
        libxcb.so.1 => /scratch/su62/volker_conda/llspy/bin/../lib/./libxcb.so.1 (0x00007f74f79bc000)
        liblzma.so.5 => /scratch/su62/volker_conda/llspy/bin/../lib/./liblzma.so.5 (0x00007f74f7796000)
        libjpeg.so.9 => /scratch/su62/volker_conda/llspy/bin/../lib/./libjpeg.so.9 (0x00007f74f7757000)
        libz.so.1 => /scratch/su62/volker_conda/llspy/bin/../lib/./libz.so.1 (0x00007f74f7737000)
        libXau.so.6 => /scratch/su62/volker_conda/llspy/bin/../lib/././libXau.so.6 (0x00007f74f7731000)
        libXdmcp.so.6 => /scratch/su62/volker_conda/llspy/bin/../lib/././libXdmcp.so.6 (0x00007f74f7729000)

And on our Ubuntu workstation with the 2080Ti, cudaDeconv -Q doesn't return anything. No error message either. Next thing is to find the testset that I had previously used to test cudaDeconv and see whether I can deconvolve anything on the commandline.

VolkerH commented 5 years ago

So, trying with a dataset on the 2080Ti machine gives the error message:

(llspy) vhil0002@MU00060589:~/2016_02_19_example_decon_deskew_data/ERTKR$ cudaDeconv -z .36 -D 32.8 -R 32.8 -i 10 -M 0 0 1 -S --input-dir . --filename-pattern sample_scan_560_20ms_zp36_cell --otf-file ../mbOTF_560_NAp5nap42_z100nm.tif
cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

And to check the driver version

(llspy) vhil0002@MU00060589:~/2016_02_19_example_decon_deskew_data/ERTKR$ nvidia-smi
Mon Mar  4 23:40:36 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  On   | 00000000:65:00.0  On |                  N/A |
| 41%   44C    P2    60W / 260W |    646MiB / 10986MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1579      G   /usr/lib/xorg/Xorg                            26MiB |
|    0      1637      G   /usr/bin/gnome-shell                          58MiB |
|    0      2538      G   /usr/lib/xorg/Xorg                           176MiB |
|    0      2676      G   /usr/bin/gnome-shell                         200MiB |
|    0      4104    C+G   ...hil0002/anaconda3/envs/llspy/bin/python   179MiB |
+-----------------------------------------------------------------------------+

VolkerH commented 5 years ago

touchpad is too sensitive ... closed by accident and reopening

tlambert03 commented 5 years ago

ah, ok. are you willing to update your drivers to 418? I thought it might be a compile flag that wasn't working, so I just added a new one and re-uploaded the llspylibs library to conda (build 2)... but I suspect it's just a driver mismatch. (I compiled against cuda 10.1, which is provided in the llspylibs conda environment just to prevent a runtime mismatch, but it probably requires a newer GPU driver).

as for libstdc++ can you run strings /usr/lib/libstdc++.so.6 | grep GLIBC on your centos node, so we can see what versions you have available?

tlambert03 commented 5 years ago

Also, I can probably downgrade to cuda 10.0 and rebuild if you’d prefer...

VolkerH commented 5 years ago

Regarding driver update, no problem in principle. However, I would like to do this when I have physical access to the workstation so I can reboot if necessary. It will probably not happen this week. I am just downloading version 418.43 to install when I'm there.

With regards to libstdc++, I just revisited my old notes, as this led me to build a container a few months ago (for the old cudaDeconv binary that was distributed via flintbox). Looking at the strings in libstdc++, this is the (truncated) output. There are also some lines with function signatures that have GLIBC in them.

GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
GLIBCXX_3.4.5
GLIBCXX_3.4.6
GLIBCXX_3.4.7
GLIBCXX_3.4.8
GLIBCXX_3.4.9
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_3.4.14
GLIBCXX_3.4.15
GLIBCXX_3.4.16
GLIBCXX_3.4.17
GLIBCXX_3.4.18
GLIBCXX_3.4.19
GLIBCXX_3.4.20
GLIBC_2.3
GLIBC_2.2.5
GLIBC_2.14
GLIBC_2.17
GLIBC_2.3.2
GLIBCXX_DEBUG_MESSAGE_LENGTH

The docker container that provided a suitable environment for running the old binary was here: https://github.com/VolkerH/MyDockerStuff/blob/master/Cudadocker/Dockerfile (I didn't bake the binary into the container, it was on a volume that is mounted inside the container).

I would have to update and rebuild that with a more recent base image from Nvidia and also add the conda create -n llspy commands to bake the executables into the image.
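For illustration, an updated container recipe might look roughly like this (the base image tag, the talley channel, and the package name are assumptions carried over from this thread, not a tested build):

```dockerfile
# Hypothetical sketch of an updated container recipe; base image tag,
# channel, and package name are assumptions, not a verified build.
FROM nvidia/cuda:10.1-runtime-ubuntu18.04
RUN apt-get update && apt-get install -y --no-install-recommends wget bzip2 \
 && rm -rf /var/lib/apt/lists/*
RUN wget -q https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh \
 && bash /tmp/miniconda.sh -b -p /opt/conda \
 && rm /tmp/miniconda.sh
ENV PATH=/opt/conda/bin:$PATH
# bake the executables into the image, mirroring the conda install from this thread
RUN conda create -y -n llspy -c talley llspy
```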

tlambert03 commented 5 years ago

the libstdc++ linking might be a bit complicated... and is likely just a "patch" in the compatibility problem for things like centos. so might be best to tackle that with a container.

for ubuntu, I recompiled a few versions on linux against different cuda runtimes, so you can pick now between cuda 9.0, 10.0 and 10.1 ... for your driver try: conda install -c talley llspylibs=0.2.0=cu10.0

(this will only work on linux until I build for other platforms like this)

and just in case anyone else stumbles upon this thread:

for cuda 10.1, you need driver ≥ 418.39 (linux) / 418.96 (windows)
for cuda 10.0, you need driver ≥ 410.48 (linux) / 411.31 (windows)
for cuda 9.0, you need driver ≥ 384.81 (linux) / 385.54 (windows)
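That mapping can be checked mechanically; a small sketch, where driver_ok is a hypothetical helper and the nvidia-smi query shown in the comment is how you would obtain the live version on a real system:

```shell
# Compare an installed NVIDIA driver version against a CUDA runtime's minimum.
# driver_ok is a hypothetical helper; sort -V does the version comparison.
driver_ok() {  # usage: driver_ok <installed> <required>; succeeds if installed >= required
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}
# On a live system: installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader)
installed=410.104   # the driver from the nvidia-smi output earlier in this thread
driver_ok "$installed" 418.39 && echo "cuda 10.1 ok" || echo "cuda 10.1 needs a newer driver"
driver_ok "$installed" 410.48 && echo "cuda 10.0 ok" || echo "cuda 10.0 needs a newer driver"
```

With the driver from the thread (410.104), this reports that the cuda 10.0 build is usable while cuda 10.1 needs a driver update, matching what was observed.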

VolkerH commented 5 years ago

Thanks a lot. Using the build for cuda 10.0 it works. So I guess the issue can be closed. Will do some benchmarking against the flowdec based approach next week.

(llspy2) vhil0002@MU00060589:~/2016_02_19_example_decon_deskew_data/ERTKR$ cudaDeconv -z .36 -D 32.8 -i 10  -S --input-dir . --filename-pattern sample_scan_560_20ms_zp36_cell --otf-file ../mbOTF_560_NAp5nap42_z100nm.tif
cudaDeconv: /home/vhil0002/anaconda3/envs/llspy2/bin/../lib/libtiff.so.5: no version information available (required by cudaDeconv)

Built : Mar  4 2019 10:46:33.  GPU 10138 MB free / 10986 MB total on GeForce RTX 2080 Ti
Looking for files to process... Found 2 file(s).

Loading raw_image: 1 out of 2.

./sample_scan_560_20ms_zp36_cell1.tif
Waiting for separate thread to finish loading image... Image loaded. Copying to raw... Done.
raw_image size             : 313 x 613 x 186
Currently padding is disabled if we are deskewing or rotating.
new ny=600
new nz=180
Loading OTF... nr x nz     : 33 x 101.

old dz = 0.360000, dxy = 0.104000, deskewFactor = 2.909654, new nx = 810, new dz = 0.195015
FFT plans allocated.         700MB    9438MB free
d_interpOTF allocated.       334MB    9102MB free
X_k allocated.               128MB    8972MB free Pinning Host RAM.  Copy raw.data to X_k HostToDevice.  Done.
deskewedRaw allocated.       128MB    8638MB free Deskewing. Copy deskewedRaw back to X_k. Copy X_k into raw_deskewed. Done.
rawGPUbuf allocated.         333MB    8434MB free
X_kminus1 allocated.         333MB    8100MB free
Y_k allocated.               333MB    7766MB free
CC allocated.                333MB    7432MB free
G_kminus1 allocated.         333MB    7098MB free
G_kminus2 allocated.         333MB    6764MB free
fftGPUbuf allocated.         334MB    6428MB free
Iteration 0.
Iteration 1.
Iteration 2. Lambda = 0.52.
Iteration 3. Lambda = 0.53.
Iteration 4. Lambda = 0.60.
Iteration 5. Lambda = 0.66.
Iteration 6. Lambda = 0.71.
Iteration 7. Lambda = 0.75.
Iteration 8. Lambda = 0.78.
Iteration 9. Lambda = 0.80.
Output: ./Deskewed/sample_scan_560_20ms_zp36_cell1_deskewed.tif
>>>file_finished
Output: ./GPUdecon/sample_scan_560_20ms_zp36_cell2_decon.tif
*** Finished! Elapsed 4.78523 seconds.  Processed 2 images.  2.39261 seconds per image. ***

tlambert03 commented 5 years ago

great! glad to hear it, and thanks for testing it out. I'd love to see a comparison with flowdec both in terms of speed and results... keep me posted.

If you have more low-level issues regarding cudaDecon, llspylibs, and the decon functions themselves, let's move the discussion over to the pycudadecon repo...