Open yarikoptic opened 5 years ago
Instead of this:
singularity run -e -B /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.430.50 \
-B /usr/lib/x86_64-linux-gnu/libcuda.so.1 neuronets-kwyk--version-0.4-gpu.sing \
raiders/sub-rid000005/anat/sub-rid000005_run-01_T1w.nii.gz out
can you try:
singularity run -e --nv neuronets-kwyk--version-0.4-gpu.sing \
raiders/sub-rid000005/anat/sub-rid000005_run-01_T1w.nii.gz out
With --nv it used to halt; now (there is a bit more free memory) it proceeds to the same crash.
I found http://tuxvoid.blogspot.com/2017/08/tensorflow-could-not-create-cudnn.html referenced from
https://github.com/tensorflow/tensorflow/issues/14048, suggesting that instructing TensorFlow to use allow_growth
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
might help, but I could not figure out where in kwyk or nobrainer to set that.
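One way to try this without touching kwyk or nobrainer might be an environment variable. This is only a sketch under assumptions: newer TensorFlow releases honor `TF_FORCE_GPU_ALLOW_GROWTH`, but the version inside this container may be too old to do so, and the helper name `singularity_cmd_with_allow_growth` is made up for illustration. Since `-e` cleans the environment, the variable is passed with Singularity's `SINGULARITYENV_` prefix.

```python
import os


def singularity_cmd_with_allow_growth(image, *args, base_env=None):
    """Return (cmd, env) for running `image` with GPU memory growth requested.

    Assumption: the TensorFlow inside the container honors the
    TF_FORCE_GPU_ALLOW_GROWTH environment variable (older releases
    may silently ignore it).
    """
    env = dict(os.environ if base_env is None else base_env)
    # With `-e` the host environment is cleaned; variables prefixed with
    # SINGULARITYENV_ are still passed in, with the prefix stripped.
    env["SINGULARITYENV_TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    cmd = ["singularity", "run", "-e", "--nv", image, *args]
    return cmd, env


cmd, env = singularity_cmd_with_allow_growth(
    "neuronets-kwyk--version-0.4-gpu.sing",
    "raiders/sub-rid000005/anat/sub-rid000005_run-01_T1w.nii.gz",
    "out",
)
print(" ".join(cmd))
# To actually launch it: subprocess.run(cmd, env=env, check=True)
```

If the variable is ignored, the behavior should simply be unchanged, so it is cheap to test.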
It used to work on my laptop, but no longer does. I fear it is due to some interaction with the GPU also being used as the actual graphics card, so that Xorg consumes too much memory (although the requested ~1.3GB is less than the ~2GB available), or something like that.
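To make the memory comparison concrete, here is a small sketch of the arithmetic. It assumes a "used, total" line in MiB, as printed by `nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits`; the helper `free_mib` is hypothetical, and the sample numbers are copied from the nvidia-smi table in this comment.

```python
def free_mib(query_line):
    """Parse one 'used, total' line (values in MiB) and return free MiB."""
    used, total = (int(field.strip()) for field in query_line.split(","))
    return total - used


# Values taken from the nvidia-smi table in this comment (2297MiB / 3911MiB).
sample = "2297, 3911"
free = free_mib(sample)

# The ~1.3GB request mentioned above, converted to MiB (an approximation).
requested_mib = 1.3 * 1024

print(free, free >= requested_mib)
```

Note that this is only a snapshot: by the time TensorFlow starts, it reports freeMemory: 1.41GiB, so the margin over the request is fairly small.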
nvidia-smi
```shell
$> nvidia-smi
Mon Nov 11 09:55:21 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50       Driver Version: 430.50       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro T2000        Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   43C    P8     3W /  N/A |   2297MiB /  3911MiB |     19%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     21824      G   /usr/lib/xorg/Xorg                           141MiB |
|    0     25467      G   /usr/lib/xorg/Xorg                          1670MiB |
|    0     25596      G   /usr/bin/gnome-shell                         180MiB |
|    0     27333      G   ...uest-channel-token=14439694130078186709   232MiB |
|    0     28802      G   /usr/lib/xorg/Xorg                             6MiB |
|    0     28899      G   /usr/bin/gnome-shell                           5MiB |
+-----------------------------------------------------------------------------+
```
the actual run via singularity
```shell
$> singularity run -e -B /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.430.50 -B /usr/lib/x86_64-linux-gnu/libcuda.so.1 neuronets-kwyk--version-0.4-gpu.sing raiders/sub-rid000005/anat/sub-rid000005_run-01_T1w.nii.gz out
Bayesian dropout functions have been loaded.
Your version: v0.4 Latest version: 0.4
++ Conforming volume to 1mm^3 voxels and size 256x256x256.
/opt/kwyk/freesurfer/bin/mri_convert: line 2: /opt/kwyk/freesurfer/sources.sh: No such file or directory
mri_convert.bin --conform raiders/sub-rid000005/anat/sub-rid000005_run-01_T1w.nii.gz /tmp/tmpwtickiw9.nii.gz
$Id: mri_convert.c,v 1.226 2016/02/26 16:15:24 mreuter Exp $
reading from raiders/sub-rid000005/anat/sub-rid000005_run-01_T1w.nii.gz...
TR=10.00, TE=0.00, TI=0.00, flip angle=0.00
i_ras = (0, -1, 0)
j_ras = (0, 0, 1)
k_ras = (1, 0, 0)
changing data type from float to uchar (noscale = 0)...
MRIchangeType: Building histogram
Reslicing using trilinear interpolation
writing to /tmp/tmpwtickiw9.nii.gz...
++ Running forward pass of model.
2019-11-11 14:57:43.820728: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-11-11 14:57:43.916219: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-11 14:57:43.916394: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: Quadro T2000 major: 7 minor: 5 memoryClockRate(GHz): 1.5
pciBusID: 0000:01:00.0
totalMemory: 3.82GiB freeMemory: 1.41GiB
2019-11-11 14:57:43.916409: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-11-11 14:57:44.267550: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-11 14:57:44.267570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2019-11-11 14:57:44.267575: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-11-11 14:57:44.267684: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1246 MB memory) -> physical GPU (device: 0, name: Quadro T2000, pci bus id: 0000:01:00.0, compute capability: 7.5)
Normalizer being used