allanzelener / YAD2K

YAD2K: Yet Another Darknet 2 Keras

Resource Exhausted Error while converting a Darknet model to a Keras model. #171

Closed kankratekaran closed 5 years ago

kankratekaran commented 5 years ago

The full log is shown below for reference. Please look carefully at the sections in bold.

I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
Parsing section maxpool_0
Parsing section convolutional_1
conv2d bn leaky (3, 3, 32, 64)
Parsing section maxpool_1
Parsing section convolutional_2
conv2d bn leaky (3, 3, 64, 128)
Parsing section convolutional_3
conv2d bn leaky (1, 1, 128, 64)
Parsing section convolutional_4
conv2d bn leaky (3, 3, 64, 128)
Parsing section maxpool_2
Parsing section convolutional_5
conv2d bn leaky (3, 3, 128, 256)
2019-08-02 15:59:13.945489: W tensorflow/core/common_runtime/bfc_allocator.cc:314] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.12MiB (rounded to 1179648). Current allocation summary follows.
2019-08-02 15:59:13.945599: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (256): Total Chunks: 36, Chunks in use: 36. 9.0KiB allocated for chunks. 9.0KiB in use in bin. 5.0KiB client-requested in use in bin.
2019-08-02 15:59:13.945657: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (512): Total Chunks: 16, Chunks in use: 16. 8.0KiB allocated for chunks. 8.0KiB in use in bin. 8.0KiB client-requested in use in bin.
2019-08-02 15:59:13.945732: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (1024): Total Chunks: 1, Chunks in use: 1. 1.2KiB allocated for chunks. 1.2KiB in use in bin. 1.0KiB client-requested in use in bin.
2019-08-02 15:59:13.945782: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (2048): Total Chunks: 1, Chunks in use: 1. 3.5KiB allocated for chunks. 3.5KiB in use in bin. 3.4KiB client-requested in use in bin.
2019-08-02 15:59:13.945824: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (4096): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.945869: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (8192): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.945916: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (16384): Total Chunks: 1, Chunks in use: 0. 25.0KiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.945969: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (32768): Total Chunks: 1, Chunks in use: 1. 34.0KiB allocated for chunks. 34.0KiB in use in bin. 32.0KiB client-requested in use in bin.
2019-08-02 15:59:13.946022: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (65536): Total Chunks: 1, Chunks in use: 1. 72.0KiB allocated for chunks. 72.0KiB in use in bin. 72.0KiB client-requested in use in bin.
2019-08-02 15:59:13.946067: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (131072): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.946103: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (262144): Total Chunks: 3, Chunks in use: 2. 864.0KiB allocated for chunks. 576.0KiB in use in bin. 576.0KiB client-requested in use in bin.
2019-08-02 15:59:13.946147: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (524288): Total Chunks: 1, Chunks in use: 0. 839.2KiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.946191: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (1048576): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.946244: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (2097152): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.946288: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (4194304): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.946319: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (8388608): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.946353: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (16777216): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.946396: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (33554432): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.946424: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (67108864): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.946457: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (134217728): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.946503: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (268435456): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-02 15:59:13.946549: I tensorflow/core/common_runtime/bfc_allocator.cc:780] Bin for 1.12MiB was 1.00MiB, Chunk State:
2019-08-02 15:59:13.946590: I tensorflow/core/common_runtime/bfc_allocator.cc:793] Next region of size 1900544
2019-08-02 15:59:13.946623: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e00000 next 1 of size 1280
2019-08-02 15:59:13.946653: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e00500 next 2 of size 256
2019-08-02 15:59:13.946690: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e00600 next 3 of size 256
2019-08-02 15:59:13.946726: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e00700 next 6 of size 256
2019-08-02 15:59:13.946762: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e00800 next 7 of size 256
2019-08-02 15:59:13.946801: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e00900 next 8 of size 256
2019-08-02 15:59:13.946838: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e00a00 next 9 of size 256
2019-08-02 15:59:13.946875: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e00b00 next 10 of size 256
2019-08-02 15:59:13.946913: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e00c00 next 11 of size 256
2019-08-02 15:59:13.946952: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e00d00 next 12 of size 256
2019-08-02 15:59:13.946989: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e00e00 next 13 of size 256
2019-08-02 15:59:13.947026: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e00f00 next 14 of size 256
2019-08-02 15:59:13.947064: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e01000 next 15 of size 256
2019-08-02 15:59:13.947102: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e01100 next 18 of size 256
2019-08-02 15:59:13.947138: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e01200 next 19 of size 256
2019-08-02 15:59:13.947177: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e01300 next 20 of size 256
2019-08-02 15:59:13.947215: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e01400 next 4 of size 256
2019-08-02 15:59:13.947253: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e01500 next 5 of size 3584
2019-08-02 15:59:13.947291: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e02300 next 21 of size 256
2019-08-02 15:59:13.947327: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e02400 next 22 of size 256
2019-08-02 15:59:13.947364: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e02500 next 23 of size 256
2019-08-02 15:59:13.947402: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e02600 next 24 of size 256
2019-08-02 15:59:13.947439: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e02700 next 25 of size 256
2019-08-02 15:59:13.947475: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e02800 next 26 of size 256
2019-08-02 15:59:13.947515: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e02900 next 29 of size 512
2019-08-02 15:59:13.947551: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e02b00 next 30 of size 512
2019-08-02 15:59:13.947589: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e02d00 next 31 of size 512
2019-08-02 15:59:13.947625: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e02f00 next 32 of size 512
2019-08-02 15:59:13.947660: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e03100 next 33 of size 512
2019-08-02 15:59:13.947696: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e03300 next 34 of size 512
2019-08-02 15:59:13.947731: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e03500 next 35 of size 512
2019-08-02 15:59:13.947753: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e03700 next 36 of size 512
2019-08-02 15:59:13.947770: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e03900 next 37 of size 256
2019-08-02 15:59:13.947796: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e03a00 next 38 of size 256
2019-08-02 15:59:13.947829: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e03b00 next 40 of size 256
2019-08-02 15:59:13.947855: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e03c00 next 41 of size 256
2019-08-02 15:59:13.947877: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e03d00 next 42 of size 256
2019-08-02 15:59:13.947910: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e03e00 next 43 of size 256
2019-08-02 15:59:13.947946: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e03f00 next 44 of size 256
2019-08-02 15:59:13.947982: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e04000 next 45 of size 256
2019-08-02 15:59:13.948019: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e04100 next 46 of size 256
2019-08-02 15:59:13.948056: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e04200 next 47 of size 256
2019-08-02 15:59:13.948086: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e04300 next 48 of size 256
2019-08-02 15:59:13.948118: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e04400 next 49 of size 256
2019-08-02 15:59:13.948146: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e04500 next 51 of size 512
2019-08-02 15:59:13.948168: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e04700 next 52 of size 512
2019-08-02 15:59:13.948201: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e04900 next 53 of size 512
2019-08-02 15:59:13.948232: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e04b00 next 54 of size 512
2019-08-02 15:59:13.948261: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e04d00 next 55 of size 512
2019-08-02 15:59:13.948293: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e04f00 next 56 of size 512
2019-08-02 15:59:13.948316: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e05100 next 57 of size 512
2019-08-02 15:59:13.948333: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e05300 next 58 of size 512
2019-08-02 15:59:13.948361: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e05500 next 59 of size 256
2019-08-02 15:59:13.948396: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e05600 next 60 of size 256
2019-08-02 15:59:13.948433: I tensorflow/core/common_runtime/bfc_allocator.cc:800] Free at 0x7f9d86e05700 next 39 of size 25600
2019-08-02 15:59:13.948471: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e0bb00 next 16 of size 34816
2019-08-02 15:59:13.948509: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e14300 next 17 of size 73728
2019-08-02 15:59:13.948532: I tensorflow/core/common_runtime/bfc_allocator.cc:800] Free at 0x7f9d86e26300 next 27 of size 294912
2019-08-02 15:59:13.948553: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86e6e300 next 28 of size 294912
2019-08-02 15:59:13.948584: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7f9d86eb6300 next 50 of size 294912
2019-08-02 15:59:13.948620: I tensorflow/core/common_runtime/bfc_allocator.cc:800] Free at 0x7f9d86efe300 next 18446744073709551615 of size 859392
2019-08-02 15:59:13.948654: I tensorflow/core/common_runtime/bfc_allocator.cc:809] Summary of in-use Chunks by size:
2019-08-02 15:59:13.948686: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 36 Chunks of size 256 totalling 9.0KiB
2019-08-02 15:59:13.948741: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 16 Chunks of size 512 totalling 8.0KiB
2019-08-02 15:59:13.948788: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 1 Chunks of size 1280 totalling 1.2KiB
2019-08-02 15:59:13.948827: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 1 Chunks of size 3584 totalling 3.5KiB
2019-08-02 15:59:13.948867: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 1 Chunks of size 34816 totalling 34.0KiB
2019-08-02 15:59:13.948907: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 1 Chunks of size 73728 totalling 72.0KiB
2019-08-02 15:59:13.948948: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 2 Chunks of size 294912 totalling 576.0KiB
2019-08-02 15:59:13.948989: I tensorflow/core/common_runtime/bfc_allocator.cc:816] Sum Total of in-use chunks: 703.8KiB
2019-08-02 15:59:13.949030: I tensorflow/core/common_runtime/bfc_allocator.cc:818] total_region_allocated_bytes_: 1900544 memory_limit_: 1900544 available bytes: 0 curr_region_allocation_bytes_: 3801088
2019-08-02 15:59:13.949078: I tensorflow/core/common_runtime/bfc_allocator.cc:824] Stats: Limit: 1900544 InUse: 720640 MaxInUse: 1010944 NumAllocs: 63 MaxAllocSize: 294912
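
The allocator figures in this log can be cross-checked with a little arithmetic. The sketch below is only an illustration: every number is copied from the log, and the helper names (`tensor_bytes`, `fits`) are made up for this example. It suggests that the float32 kernel of shape (3, 3, 128, 256) needs exactly the 1179648 bytes the allocator reports, and that although the free chunks add up to slightly more than that, no single free chunk is large enough.

```python
# All constants are copied from the log; the helper names are hypothetical.
KERNEL_SHAPE = (3, 3, 128, 256)         # conv2d_5 kernel that failed to allocate
BYTES_PER_FLOAT32 = 4
ALLOCATOR_LIMIT = 1900544               # "Limit" in the Stats line
IN_USE = 720640                         # "InUse" in the Stats line
FREE_CHUNKS = [25600, 294912, 859392]   # sizes of the three "Free at ..." chunks


def tensor_bytes(shape, itemsize=BYTES_PER_FLOAT32):
    """Bytes needed for a dense tensor with the given shape and element size."""
    count = 1
    for dim in shape:
        count *= dim
    return count * itemsize


def fits(request, free_chunks):
    """True if at least one contiguous free chunk can hold the request."""
    return any(chunk >= request for chunk in free_chunks)


request = tensor_bytes(KERNEL_SHAPE)
print(request)                      # 1179648, the "rounded to" figure in the log
print(ALLOCATOR_LIMIT - IN_USE)     # 1179904 bytes free in total
print(max(FREE_CHUNKS))             # 859392, the largest contiguous free chunk
print(fits(request, FREE_CHUNKS))   # False
```

With a limit of under 2 MB on the whole allocator, the real constraint appears to be how little GPU memory was available to the process in the first place rather than the size of this one layer.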

2019-08-02 15:59:13.949152: W tensorflow/core/common_runtime/bfc_allocator.cc:319] *****__****_____
2019-08-02 15:59:13.951348: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at random_op.cc:76 : Resource exhausted: OOM when allocating tensor with shape[3,3,128,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[3,3,128,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[{{node conv2d_5/kernel/Initializer/random_uniform/RandomUniform}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "yad2k.py", line 270, in <module>
    _main(parser.parse_args())
  File "yad2k.py", line 186, in _main
    padding=padding))(prev_layer)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 688, in __call__
    self.set_weights(self._initial_weights)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1114, in set_weights
    param_values = backend.batch_get_value(params)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py", line 3010, in batch_get_value
    return get_session(tensors).run(tensors)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py", line 462, in get_session
    _initialize_variables(session)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py", line 886, in _initialize_variables
    session.run(variables_module.variables_initializer(uninitialized_vars))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1350, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[3,3,128,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[node conv2d_5/kernel/Initializer/random_uniform/RandomUniform (defined at yad2k.py:186) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
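
The first line of the log reports that the GPU device was created "with 1 MB memory", which usually points to the GPU memory already being held by something else. One possible workaround, offered here as an assumption and not as something confirmed in this thread, is to run the conversion on the CPU by hiding the GPU from TensorFlow before it is imported. The sketch below relies on the standard CUDA environment variable `CUDA_VISIBLE_DEVICES`; the helper name `hide_gpus` is hypothetical.

```python
import os


def hide_gpus(env=os.environ):
    """Hide all CUDA devices from libraries that are imported after this call.

    Assumption: CUDA-based libraries such as TensorFlow read
    CUDA_VISIBLE_DEVICES when they first initialise the GPU runtime, so this
    has to run before they are imported.
    """
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


hide_gpus()
# Import TensorFlow / Keras only after this point, then run the conversion.
```

The same effect can be had from the shell by prefixing the conversion command with `CUDA_VISIBLE_DEVICES=-1`. Converting weights only needs the layers to be built and saved, so running on the CPU should be acceptable for this step, though that is not verified here.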