Closed. talmazov closed this issue 4 years ago.
Did you find any solution for this issue?
No solution. I suspect this is an issue with the underlying code not being able to copy data between system RAM and VRAM, but I'm not sure why; I see the bug was assigned to someone. For now, either drastically decrease the DICOM resolution or increase your graphics card's available VRAM. To be fair, even when processing CT/CBCT DICOM on the CPU (which does not produce this error), my PC easily maxes out its 32 GB of RAM.
Just change the TensorFlow version! It does work!
You can follow this link to solve the issue: https://github.com/NifTK/NiftyNet/issues/447
Hey, so I tried again. When I run TensorFlow GPU for object detection, everything runs fine. I have installed numpy 1.16.0 and tensorflow-gpu 1.13.2; however, I still get the GPU->CPU Memcpy failed error.
I am not sure why the cuDNN handle could not be created:
2019-12-06 22:01:17.891564: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-12-06 22:01:17.891610: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-06 22:01:17.891614: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-12-06 22:01:17.891617: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-12-06 22:01:17.891694: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4853 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
INFO:niftynet: Parameters from random initialisations ...
2019-12-06 22:01:24.491935: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
2019-12-06 22:01:24.939564: I tensorflow/core/kernels/cuda_solvers.cc:159] Creating CudaSolver handles for stream 0x4ae44b0
2019-12-06 22:01:35.242771: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 12 of 30
2019-12-06 22:01:45.241902: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 24 of 30
2019-12-06 22:01:50.185805: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:140] Shuffle buffer filled.
2019-12-06 22:01:55.576205: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-12-06 22:01:55.588900: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-12-06 22:01:55.600046: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-12-06 22:01:55.600087: W ./tensorflow/stream_executor/stream.h:2099] attempting to perform DNN operation using StreamExecutor without DNN support
INFO:niftynet: cleaning up...
INFO:niftynet: stopping sampling threads
2019-12-06 22:01:56.241818: I tensorflow/stream_executor/stream.cc:2079] [stream=0x4aeaa00,impl=0x4aeaaa0] did not wait for [stream=0x4aea520,impl=0x4ae44d0]
2019-12-06 22:01:56.241845: I tensorflow/stream_executor/stream.cc:5014] [stream=0x4aeaa00,impl=0x4aeaaa0] did not memcpy device-to-host; source: 0x7fc710cc5b00
2019-12-06 22:01:56.241879: I tensorflow/stream_executor/stream.cc:2079] [stream=0x4aeaa00,impl=0x4aeaaa0] did not wait for [stream=0x4aea520,impl=0x4ae44d0]
2019-12-06 22:01:56.241921: I tensorflow/stream_executor/stream.cc:5014] [stream=0x4aeaa00,impl=0x4aeaaa0] did not memcpy device-to-host; source: 0x7fc710cc5c00
2019-12-06 22:01:56.241933: I tensorflow/stream_executor/stream.cc:2079] [stream=0x4aeaa00,impl=0x4aeaaa0] did not wait for [stream=0x4aea520,impl=0x4ae44d0]
2019-12-06 22:01:56.241938: I tensorflow/stream_executor/stream.cc:5014] [stream=0x4aeaa00,impl=0x4aeaaa0] did not memcpy device-to-host; source: 0x7fc710cc5a00
2019-12-06 22:01:56.241949: I tensorflow/stream_executor/stream.cc:2079] [stream=0x4aeaa00,impl=0x4aeaaa0] did not wait for [stream=0x4aea520,impl=0x4ae44d0]
2019-12-06 22:01:56.241948: F tensorflow/core/common_runtime/gpu/gpu_util.cc:292] GPU->CPU Memcpy failed
2019-12-06 22:01:56.241964: I tensorflow/stream_executor/stream.cc:5014] [stream=0x4aeaa00,impl=0x4aeaaa0] did not memcpy device-to-host; source: 0x7fc710cc5e00
Aborted
I tried
sudo python3 net_download.py dense_vnet_abdominal_ct_model_zoo
sudo python3 net_segment.py inference -c ~/niftynet/extensions/dense_vnet_abdominal_ct/config.ini
and I get:
2019-12-07 11:37:02.795645: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4904 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
INFO:niftynet: Restoring parameters from /home/mayotic/niftynet/models/dense_vnet_abdominal_ct/models/model.ckpt-3000
2019-12-07 11:37:06.140918: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-12-07 11:37:06.143425: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-12-07 11:37:06.148064: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-12-07 11:37:06.148081: W ./tensorflow/stream_executor/stream.h:2099] attempting to perform DNN operation using StreamExecutor without DNN support
INFO:niftynet: cleaning up...
INFO:niftynet: stopping sampling threads
What version of cuDNN is everybody else running? I have NiftyNet 0.6, CUDA 10.0, tensorflow-gpu 1.13.2, and numpy 1.16 on a GeForce RTX 2060 with 6 GB VRAM and NVIDIA driver 440.33.01. TensorFlow tries to allocate 5 GB with spatial_window_size = (64, 64, 512) and the dense_vnet network.
Is this a common error thrown when the GPU does not have enough physical memory to run training?
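A quick back-of-envelope check suggests that window size is indeed heavy for 6 GB of VRAM. The arithmetic below is a sketch: the channel count is illustrative, not Dense V-Net's actual layer widths, and it ignores gradients and optimizer state, which multiply the footprint further.

```python
# Rough activation-memory estimate for one (64, 64, 512) float32 patch.
voxels = 64 * 64 * 512          # spatial_window_size from the config
bytes_per_voxel = 4             # float32
one_map_mib = voxels * bytes_per_voxel / 2**20
print(f"{one_map_mib:.0f} MiB per single-channel feature map")

# With e.g. 24 channels at full resolution (hypothetical width),
# a single layer's activations alone:
print(f"{one_map_mib * 24:.0f} MiB for a 24-channel layer")
```

Since a deep network keeps many such feature maps alive at once (plus gradients during training), a few gigabytes disappear quickly, which is consistent with TensorFlow trying to grab ~5 GB.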
Hey everyone, I am trying to run training on 2 CBCT segmented volumes, but I am running into a GPU->CPU Memcpy failed error. I read in the TensorFlow repo that this is an issue of insufficient memory, so I reduced the size of the volumes drastically, as well as the batch_size and queue_length; however, I still get this error. I have included the configuration file here. My PC has 32 GB of RAM and a GeForce RTX 2060 with 6 GB VRAM. NiftyNet and TensorFlow recognize the device. I have CUDA 10 installed with NVIDIA driver 418 and cuDNN 7.6.3. When I run sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) in the Python console, the GPU appears and everything is fine.
Most notably, I see "could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR"; then, when NiftyNet performs "Parameters from random initialisations" and the shuffle buffer is filled, it gives the GPU->CPU Memcpy failed error: 2019-08-24 17:21:52.622248: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:162] Shuffle buffer filled.
I do not get this issue when performing training on the CPU.
I switched networks from dense_vnet to highres3dnet, and at this point I only get the "Could not create cudnn handle" error. I modified the tf_config() method in util_common.py to include config.gpu_options.allow_growth = True, as described in https://github.com/tensorflow/tensorflow/issues/24496, but that does not seem to address the issue.
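For anyone reproducing this, the allow_growth change amounts to something like the following minimal TF 1.x sketch (the exact shape of NiftyNet's tf_config() may differ from this):

```python
import tensorflow as tf  # TensorFlow 1.x API

def tf_config():
    # Let TensorFlow grow its GPU memory pool on demand instead of
    # grabbing nearly all free VRAM up front; this sometimes avoids
    # CUDNN_STATUS_INTERNAL_ERROR when cuDNN cannot find enough
    # contiguous memory to initialise its handle.
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    return config
```

This is a session-config fragment, so whether it helps depends on how much VRAM is actually free when cuDNN initialises.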
Any thoughts? Is this an issue of not enough VRAM?
The command I use to run training is python3 net_segment.py train -c ~/mandible_segmentation/config.ini
My configuration is:
Running the following code from within the python3 CLI works just fine:
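(This is the device-placement check mentioned earlier in the post, written out as a TF 1.x snippet; the list_devices() call is just a convenient way to confirm the GPU is visible.)

```python
import tensorflow as tf  # TensorFlow 1.x

# Log where each op is placed; the RTX 2060 should appear as /device:GPU:0.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
for dev in sess.list_devices():
    print(dev.name)
```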