devsisters / DQN-tensorflow

Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning
MIT License
2.48k stars 764 forks source link

CUDA_ERROR_OUT_OF_MEMOR #49

Open vulgatecnn opened 6 years ago

vulgatecnn commented 6 years ago

name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7465 pciBusID: 0000:01:00.0 totalMemory: 8.00GiB freeMemory: 6.63GiB 2018-04-09 16:56:08.725121: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1423] Adding visible gpu devices: 0 2018-04-09 16:56:09.233091: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix: 2018-04-09 16:56:09.236868: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:917] 0 2018-04-09 16:56:09.239291: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:930] 0: N 2018-04-09 16:56:09.242389: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8192 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1) 2018-04-09 16:56:09.249227: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 8.00G (8589934592 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-04-09 16:56:09.253673: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 7.20G (7730940928 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY Traceback (most recent call last): File "main.py", line 75, in tf.app.run() File "D:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run _sys.exit(main(argv)) File "main.py", line 53, in main config = get_config(FLAGS) or FLAGS File "D:\tools\dqq\DQN-tensorflow-master\config.py", line 58, in get_config

darioromero commented 6 years ago

I am getting something similar when testing YOLOv2 with a small video only 43sec. Load | Yep! | local flatten 2x2 | (?, 19, 19, 256) Load | Yep! | concat [27, 24] | (?, 19, 19, 1280) Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024) Load | Yep! | conv 1x1p0_1 linear | (?, 19, 19, 425) -------+--------+----------------------------------+--------------- GPU mode with 1.0 usage 2018-07-05 11:48:02.192435: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 2018-07-05 11:48:02.395594: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties: name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.645 pciBusID: 0000:01:00.0 totalMemory: 8.00GiB freeMemory: 6.62GiB 2018-07-05 11:48:02.401522: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0 2018-07-05 11:48:03.028882: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix: 2018-07-05 11:48:03.031903: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929] 0 2018-07-05 11:48:03.033596: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N 2018-07-05 11:48:03.035288: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8192 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1) 2018-07-05 11:48:03.041835: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 8.00G (8589934592 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-07-05 11:48:03.044590: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 7.20G (7730940928 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY Finished in 10.13445782661438s

Press [ESC] to quit demo 2018-07-05 11:48:10.770607: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_dnn.cc:455] could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED 2018-07-05 11:48:10.778007: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_dnn.cc:427] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM 2018-07-05 11:48:10.780970: F T:\src\github\tensorflow\tensorflow\core\kernels\conv_ops.cc:713] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)