nejyeah / DeepPicker-python

This is the source code for the paper DeepPicker.

docker image #2

Open thorstenwagner opened 6 years ago

thorstenwagner commented 6 years ago

Hi,

I've created a Docker image for easy testing on our GPU machines, and I'm happy to share it with you: https://github.com/thorstenwagner/docker-deeppicker https://hub.docker.com/r/thorstenwagner/docker-deeppicker/

However, when I try to run your picker, I get the following error:

python autoPick.py --inputDir '/pre_calc/images/corrsum/' --pre_trained_model './trained_model/model_demo_type3' --mrc_number 10 --particle_size 256 --outputDir '/pre_calc/results' --coordinate_symbol '_cnnPick' --threshold 0.5 
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.80GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:572] creating context when one is currently active; existing: 0x3173c40
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 1 with properties: 
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:02:00.0
Total memory: 7.92GiB
Free memory: 7.80GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:572] creating context when one is currently active; existing: 0x380e010
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 2 with properties: 
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:03:00.0
Total memory: 7.92GiB
Free memory: 7.80GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:572] creating context when one is currently active; existing: 0x3ea87d0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 3 with properties: 
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:04:00.0
Total memory: 7.92GiB
Free memory: 7.74GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 1 2 3 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y Y Y Y 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 1:   Y Y Y Y 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 2:   Y Y Y Y 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 3:   Y Y Y Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 1080, pci bus id: 0000:02:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:2) -> (device: 2, name: GeForce GTX 1080, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:3) -> (device: 3, name: GeForce GTX 1080, pci bus id: 0000:04:00.0)
/pre_calc/images/corrsum/TcdA1-0001_frames_sum.mrc
E tensorflow/stream_executor/cuda/cuda_dnn.cc:346] **Loaded cudnn library: 4008 but source was compiled against 4007.  If using a binary install, upgrade your cudnn library to match.  If building from sources, make sure the library loaded matches the version you specified during compile configuration.**
F tensorflow/core/kernels/conv_ops.cc:459] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms) 
Aborted (core dumped)

I'm using CUDA Toolkit 7.5 and cuDNN v4.

How can I recompile it?

Best, Thorsten

nejyeah commented 6 years ago

Thanks for helping to build a Docker image for DeepPicker. I will try to figure out the problem. As far as I can see, the project needs to be revised for the new TF version, which may take some time.

thorstenwagner commented 6 years ago

As a first step, it would be enough if you could explain to me how to recompile it; then I could get a running version quickly.

nejyeah commented 6 years ago

The project should not need to be compiled. The error suggests that the cuDNN library you are using does not match the one your TF build expects, or that something else is wrong.
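To make the mismatch concrete, here is a minimal Python sketch that extracts the two version numbers from the error line in the log above. The function names are hypothetical and not part of DeepPicker or TensorFlow. The decoding step assumes cuDNN's usual integer encoding (major * 1000 + minor * 100 + patch level), so 4008 would read as 4.0.8 and 4007 as 4.0.7.

```python
import re

# Pattern for the cuDNN mismatch message that TensorFlow printed in the log above.
CUDNN_MISMATCH = re.compile(
    r"Loaded cudnn library: (\d+) but source was compiled against (\d+)"
)


def parse_cudnn_mismatch(log_text):
    """Return (loaded, compiled) as integers, or None if the message is absent."""
    match = CUDNN_MISMATCH.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def decode_cudnn_version(number):
    """Split an encoded version into (major, minor, patch).

    Assumption: the integer is major * 1000 + minor * 100 + patch.
    """
    major, rest = divmod(number, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


log_line = (
    "E tensorflow/stream_executor/cuda/cuda_dnn.cc:346] "
    "Loaded cudnn library: 4008 but source was compiled against 4007."
)

loaded, compiled = parse_cudnn_mismatch(log_line)
print("loaded:", decode_cudnn_version(loaded))
print("compiled against:", decode_cudnn_version(compiled))
```

Under that assumption, the library found at runtime is one patch level newer than the one the TF binary was built against, which points at the installed TF binary or cuDNN package rather than at DeepPicker itself.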

thorstenwagner commented 6 years ago

OK! I've opened an issue in the nvidia-docker repo. Let's see what they say. https://github.com/NVIDIA/nvidia-docker/issues/564

thorstenwagner commented 6 years ago

I changed the TensorFlow version to https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl

It is picking now!
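For anyone rebuilding the image, the wheel file name itself states which TensorFlow version and Python interpreter it targets. The sketch below splits the name using the standard wheel naming scheme (distribution, version, Python tag, ABI tag, platform tag). The helper name is hypothetical, and the code assumes the file name carries no optional build tag.

```python
def parse_wheel_name(url):
    """Split a wheel URL or file name into its five naming fields.

    Assumption: the name has no optional build tag, so it has exactly
    five dash-separated fields before the ".whl" suffix.
    """
    filename = url.rsplit("/", 1)[-1]
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    stem = filename[: -len(".whl")]
    fields = stem.split("-")
    if len(fields) != 5:
        raise ValueError("unexpected number of fields in: " + filename)
    keys = ("distribution", "version", "python_tag", "abi_tag", "platform_tag")
    return dict(zip(keys, fields))


wheel_url = (
    "https://storage.googleapis.com/tensorflow/linux/gpu/"
    "tensorflow-0.8.0-cp27-none-linux_x86_64.whl"
)
info = parse_wheel_name(wheel_url)
for key in ("distribution", "version", "python_tag", "abi_tag", "platform_tag"):
    print(key, "=", info[key])
```

Here the Python tag `cp27` indicates a CPython 2.7 build, so the container needs a matching Python 2.7 interpreter.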

thorstenwagner commented 6 years ago

More important information regarding this issue: https://github.com/NVIDIA/nvidia-docker/issues/564#issuecomment-350050031