mlcommons / tiny

MLPerf™ Tiny is an ML benchmark suite for extremely low-power systems such as microcontrollers
https://mlcommons.org/en/groups/inference-tiny/
Apache License 2.0

Cannot get anomaly_detection to run, memory issues #130

Open alxhoff opened 2 years ago

alxhoff commented 2 years ago

Hi all,

I am having problems running the 00_train.py script in the anomaly_detection benchmark. I am unsure what the cause is, as the issue persists regardless of the batch size I use.

I found that if I reduce the train_data training input to around 700000 items (train_data[:700000]), it runs without problems. The weird thing is that my system, which sadly only has 32GB of RAM, doesn't get close to running out of system memory when I watch free -m. The first epoch trains until it is essentially finished, then the error below is thrown when moving on to the second epoch. I will keep googling, but I am hoping someone here has an idea of where the problem could be coming from.

2022-06-01 14:46:55.918879: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 1806336000 exceeds 10% of free system memory.
2022-06-01 14:46:56.979387: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 1806336000 exceeds 10% of free system memory.
2022-06-01 14:46:57.674953: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 1806336000 exceeds 10% of free system memory.
2022-06-01 14:46:58.187186: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 1806336000 exceeds 10% of free system memory.
Epoch 1/100
1374/1379 [============================>.] - ETA: 0s - loss: 96.1444
2022-06-01 14:47:24.023059: W tensorflow/core/common_runtime/bfc_allocator.cc:462] Allocator (GPU_0_bfc) ran out of memory trying to allocate 191.41MiB (rounded to 200704000)requested by op _EagerConst
If the cause is memory fragmentation maybe the environment variable 'TF_GPU_ALLOCATOR=cuda_malloc_async' will improve the situation. 
Current allocation summary follows.
Current allocation summary follows.
2022-06-01 14:47:24.023086: I tensorflow/core/common_runtime/bfc_allocator.cc:1010] BFCAllocator dump for GPU_0_bfc
2022-06-01 14:47:24.023100: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (256):  Total Chunks: 50, Chunks in use: 50. 12.5KiB allocated for chunks. 12.5KiB in use in bin. 544B client-requested in use in bin.
2022-06-01 14:47:24.023110: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (512):  Total Chunks: 88, Chunks in use: 88. 44.2KiB allocated for chunks. 44.2KiB in use in bin. 44.0KiB client-requested in use in bin.
2022-06-01 14:47:24.023118: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (1024):     Total Chunks: 1, Chunks in use: 1. 1.2KiB allocated for chunks. 1.2KiB in use in bin. 1.0KiB client-requested in use in bin.
2022-06-01 14:47:24.023126: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (2048):     Total Chunks: 3, Chunks in use: 3. 7.5KiB allocated for chunks. 7.5KiB in use in bin. 7.5KiB client-requested in use in bin.
2022-06-01 14:47:24.023135: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (4096):     Total Chunks: 6, Chunks in use: 6. 24.5KiB allocated for chunks. 24.5KiB in use in bin. 24.0KiB client-requested in use in bin.
2022-06-01 14:47:24.023143: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (8192):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023149: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (16384):    Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023156: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (32768):    Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023164: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (65536):    Total Chunks: 18, Chunks in use: 18. 1.24MiB allocated for chunks. 1.24MiB in use in bin. 1.12MiB client-requested in use in bin.
2022-06-01 14:47:24.023171: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (131072):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023179: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (262144):   Total Chunks: 7, Chunks in use: 6. 2.19MiB allocated for chunks. 1.88MiB in use in bin. 1.88MiB client-requested in use in bin.
2022-06-01 14:47:24.023186: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (524288):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023193: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (1048576):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023200: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (2097152):  Total Chunks: 1, Chunks in use: 0. 2.44MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023207: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (4194304):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023214: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (8388608):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023221: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (16777216):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023230: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (33554432):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023237: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (67108864):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023245: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (134217728):    Total Chunks: 1, Chunks in use: 0. 181.04MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2022-06-01 14:47:24.023253: I tensorflow/core/common_runtime/bfc_allocator.cc:1017] Bin (268435456):    Total Chunks: 2, Chunks in use: 2. 3.36GiB allocated for chunks. 3.36GiB in use in bin. 3.36GiB client-requested in use in bin.
2022-06-01 14:47:24.023262: I tensorflow/core/common_runtime/bfc_allocator.cc:1033] Bin for 191.41MiB was 128.00MiB, Chunk State: 
2022-06-01 14:47:24.023275: I tensorflow/core/common_runtime/bfc_allocator.cc:1039]   Size: 181.04MiB | Requested Size: 4B | in_use: 0 | bin_num: 19, prev:   Size: 256B | Requested Size: 8B | in_use: 1 | bin_num: -1
2022-06-01 14:47:24.023280: I tensorflow/core/common_runtime/bfc_allocator.cc:1046] Next region of size 3808755712
2022-06-01 14:47:24.023288: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac000000 of size 1280 next 1
2022-06-01 14:47:24.023294: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac000500 of size 256 next 2
2022-06-01 14:47:24.023300: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac000600 of size 256 next 3
2022-06-01 14:47:24.023306: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac000700 of size 256 next 4
2022-06-01 14:47:24.023312: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac000800 of size 512 next 5
2022-06-01 14:47:24.023317: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac000a00 of size 256 next 8
2022-06-01 14:47:24.023323: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac000b00 of size 512 next 9
2022-06-01 14:47:24.023328: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac000d00 of size 512 next 10
2022-06-01 14:47:24.023334: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac000f00 of size 512 next 11
2022-06-01 14:47:24.023339: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac001100 of size 512 next 12
2022-06-01 14:47:24.023345: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac001300 of size 256 next 13
2022-06-01 14:47:24.023351: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac001400 of size 256 next 14
2022-06-01 14:47:24.023356: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac001500 of size 512 next 15
2022-06-01 14:47:24.023362: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac001700 of size 512 next 16
2022-06-01 14:47:24.023367: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac001900 of size 512 next 19
2022-06-01 14:47:24.023373: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac001b00 of size 512 next 20
2022-06-01 14:47:24.023378: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac001d00 of size 512 next 21
2022-06-01 14:47:24.023386: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac001f00 of size 128768 next 17
2022-06-01 14:47:24.023392: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac021600 of size 65536 next 18
2022-06-01 14:47:24.023397: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac031600 of size 512 next 22
2022-06-01 14:47:24.023403: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac031800 of size 512 next 25
2022-06-01 14:47:24.023409: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac031a00 of size 512 next 26
2022-06-01 14:47:24.023415: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac031c00 of size 512 next 27
2022-06-01 14:47:24.023420: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac031e00 of size 512 next 28
2022-06-01 14:47:24.023426: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac032000 of size 512 next 31
2022-06-01 14:47:24.023431: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac032200 of size 512 next 32
2022-06-01 14:47:24.023437: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac032400 of size 512 next 33
2022-06-01 14:47:24.023442: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac032600 of size 512 next 34
2022-06-01 14:47:24.023448: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac032800 of size 512 next 35
2022-06-01 14:47:24.023453: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac032a00 of size 256 next 36
2022-06-01 14:47:24.023459: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac032b00 of size 256 next 37
2022-06-01 14:47:24.023464: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac032c00 of size 256 next 38
2022-06-01 14:47:24.023470: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac032d00 of size 256 next 39
2022-06-01 14:47:24.023475: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac032e00 of size 256 next 42
2022-06-01 14:47:24.023481: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac032f00 of size 256 next 43
2022-06-01 14:47:24.023486: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac033000 of size 256 next 44
2022-06-01 14:47:24.023492: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac033100 of size 512 next 55
2022-06-01 14:47:24.023497: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac033300 of size 512 next 56
2022-06-01 14:47:24.023503: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac033500 of size 512 next 57
2022-06-01 14:47:24.023508: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac033700 of size 512 next 58
2022-06-01 14:47:24.023514: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac033900 of size 512 next 59
2022-06-01 14:47:24.023519: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac033b00 of size 512 next 60
2022-06-01 14:47:24.023525: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac033d00 of size 512 next 61
2022-06-01 14:47:24.023530: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac033f00 of size 512 next 63
2022-06-01 14:47:24.023536: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac034100 of size 512 next 64
2022-06-01 14:47:24.023541: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac034300 of size 512 next 65
2022-06-01 14:47:24.023547: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac034500 of size 512 next 66
2022-06-01 14:47:24.023552: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac034700 of size 512 next 67
2022-06-01 14:47:24.023558: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac034900 of size 256 next 71
2022-06-01 14:47:24.023563: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac034a00 of size 256 next 72
2022-06-01 14:47:24.023569: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac034b00 of size 256 next 73
2022-06-01 14:47:24.023574: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac034c00 of size 256 next 40
2022-06-01 14:47:24.023580: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac034d00 of size 4096 next 41
2022-06-01 14:47:24.023586: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac035d00 of size 512 next 45
2022-06-01 14:47:24.023592: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac035f00 of size 512 next 48
2022-06-01 14:47:24.023598: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac036100 of size 512 next 49
2022-06-01 14:47:24.023603: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac036300 of size 512 next 50
2022-06-01 14:47:24.023609: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac036500 of size 512 next 51
2022-06-01 14:47:24.023615: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac036700 of size 512 next 53
2022-06-01 14:47:24.023622: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac036900 of size 512 next 54
2022-06-01 14:47:24.023628: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac036b00 of size 768 next 46
2022-06-01 14:47:24.023633: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac036e00 of size 4096 next 47
2022-06-01 14:47:24.023638: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac037e00 of size 2560 next 68
2022-06-01 14:47:24.023642: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac038800 of size 256 next 76
2022-06-01 14:47:24.023647: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac038900 of size 256 next 77
2022-06-01 14:47:24.023652: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac038a00 of size 256 next 78
2022-06-01 14:47:24.023657: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac038b00 of size 256 next 79
2022-06-01 14:47:24.023661: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac038c00 of size 256 next 80
2022-06-01 14:47:24.023666: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac038d00 of size 256 next 81
2022-06-01 14:47:24.023671: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac038e00 of size 256 next 82
2022-06-01 14:47:24.023675: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac038f00 of size 256 next 83
2022-06-01 14:47:24.023680: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac039000 of size 512 next 85
2022-06-01 14:47:24.023685: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac039200 of size 512 next 86
2022-06-01 14:47:24.023690: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac039400 of size 512 next 87
2022-06-01 14:47:24.023694: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac039600 of size 512 next 88
2022-06-01 14:47:24.023697: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac039800 of size 512 next 89
2022-06-01 14:47:24.023701: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac039a00 of size 512 next 90
2022-06-01 14:47:24.023705: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac039c00 of size 512 next 91
2022-06-01 14:47:24.023709: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac039e00 of size 512 next 92
2022-06-01 14:47:24.023712: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03a000 of size 512 next 93
2022-06-01 14:47:24.023716: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03a200 of size 512 next 95
2022-06-01 14:47:24.023720: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03a400 of size 512 next 96
2022-06-01 14:47:24.023724: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03a600 of size 512 next 97
2022-06-01 14:47:24.023728: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03a800 of size 4096 next 98
2022-06-01 14:47:24.023731: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03b800 of size 256 next 99
2022-06-01 14:47:24.023736: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03b900 of size 256 next 100
2022-06-01 14:47:24.023740: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03ba00 of size 256 next 101
2022-06-01 14:47:24.023744: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03bb00 of size 4096 next 102
2022-06-01 14:47:24.023747: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03cb00 of size 512 next 103
2022-06-01 14:47:24.023751: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03cd00 of size 512 next 104
2022-06-01 14:47:24.023755: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03cf00 of size 512 next 105
2022-06-01 14:47:24.023759: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03d100 of size 512 next 107
2022-06-01 14:47:24.023762: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03d300 of size 512 next 108
2022-06-01 14:47:24.023766: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03d500 of size 512 next 109
2022-06-01 14:47:24.023770: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03d700 of size 512 next 111
2022-06-01 14:47:24.023774: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03d900 of size 512 next 112
2022-06-01 14:47:24.023778: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03db00 of size 512 next 113
2022-06-01 14:47:24.023781: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03dd00 of size 512 next 115
2022-06-01 14:47:24.023785: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03df00 of size 512 next 116
2022-06-01 14:47:24.023789: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03e100 of size 512 next 117
2022-06-01 14:47:24.023793: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03e300 of size 2560 next 119
2022-06-01 14:47:24.023797: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03ed00 of size 512 next 121
2022-06-01 14:47:24.023800: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03ef00 of size 512 next 122
2022-06-01 14:47:24.023804: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03f100 of size 512 next 123
2022-06-01 14:47:24.023808: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03f300 of size 512 next 124
2022-06-01 14:47:24.023812: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03f500 of size 512 next 125
2022-06-01 14:47:24.023816: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03f700 of size 512 next 126
2022-06-01 14:47:24.023820: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03f900 of size 512 next 128
2022-06-01 14:47:24.023823: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03fb00 of size 512 next 129
2022-06-01 14:47:24.023827: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03fd00 of size 512 next 130
2022-06-01 14:47:24.023831: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac03ff00 of size 512 next 132
2022-06-01 14:47:24.023835: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac040100 of size 512 next 133
2022-06-01 14:47:24.023838: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac040300 of size 512 next 134
2022-06-01 14:47:24.023842: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac040500 of size 4608 next 23
2022-06-01 14:47:24.023846: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac041700 of size 65536 next 24
2022-06-01 14:47:24.023850: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac051700 of size 65536 next 30
2022-06-01 14:47:24.023854: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac061700 of size 65536 next 29
2022-06-01 14:47:24.023858: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac071700 of size 65536 next 52
2022-06-01 14:47:24.023862: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac081700 of size 127232 next 6
2022-06-01 14:47:24.023866: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac0a0800 of size 327680 next 7
2022-06-01 14:47:24.023870: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac0f0800 of size 65536 next 62
2022-06-01 14:47:24.023874: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac100800 of size 327680 next 84
2022-06-01 14:47:24.023878: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac150800 of size 65536 next 94
2022-06-01 14:47:24.023881: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac160800 of size 65536 next 106
2022-06-01 14:47:24.023885: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac170800 of size 65536 next 110
2022-06-01 14:47:24.023889: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac180800 of size 65536 next 114
2022-06-01 14:47:24.023893: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac190800 of size 65536 next 70
2022-06-01 14:47:24.023897: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac1a0800 of size 327680 next 69
2022-06-01 14:47:24.023902: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1eac1f0800 of size 1806336000 next 74
2022-06-01 14:47:24.023906: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f17c98800 of size 1806336000 next 75
2022-06-01 14:47:24.023910: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83740800 of size 327680 next 118
2022-06-01 14:47:24.023913: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83790800 of size 327680 next 120
2022-06-01 14:47:24.023917: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f837e0800 of size 65536 next 127
2022-06-01 14:47:24.023921: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f837f0800 of size 65536 next 131
2022-06-01 14:47:24.023925: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83800800 of size 256 next 135
2022-06-01 14:47:24.023929: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83800900 of size 256 next 136
2022-06-01 14:47:24.023932: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83800a00 of size 256 next 137
2022-06-01 14:47:24.023936: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83800b00 of size 4096 next 138
2022-06-01 14:47:24.023940: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83801b00 of size 512 next 139
2022-06-01 14:47:24.023944: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83801d00 of size 512 next 140
2022-06-01 14:47:24.023948: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83801f00 of size 512 next 141
2022-06-01 14:47:24.023951: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83802100 of size 65536 next 142
2022-06-01 14:47:24.023955: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83812100 of size 512 next 143
2022-06-01 14:47:24.023959: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83812300 of size 512 next 144
2022-06-01 14:47:24.023963: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83812500 of size 512 next 145
2022-06-01 14:47:24.023967: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83812700 of size 65536 next 146
2022-06-01 14:47:24.023971: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83822700 of size 512 next 147
2022-06-01 14:47:24.023974: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83822900 of size 512 next 148
2022-06-01 14:47:24.023979: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83822b00 of size 512 next 149
2022-06-01 14:47:24.023982: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83822d00 of size 65536 next 150
2022-06-01 14:47:24.023986: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83832d00 of size 512 next 151
2022-06-01 14:47:24.023990: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83832f00 of size 512 next 152
2022-06-01 14:47:24.023994: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83833100 of size 512 next 153
2022-06-01 14:47:24.023998: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83833300 of size 327680 next 154
2022-06-01 14:47:24.024001: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83883300 of size 2560 next 155
2022-06-01 14:47:24.024005: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83883d00 of size 256 next 156
2022-06-01 14:47:24.024009: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83883e00 of size 256 next 157
2022-06-01 14:47:24.024013: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83883f00 of size 256 next 158
2022-06-01 14:47:24.024017: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884000 of size 256 next 159
2022-06-01 14:47:24.024020: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884100 of size 256 next 160
2022-06-01 14:47:24.024024: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884200 of size 256 next 161
2022-06-01 14:47:24.024028: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884300 of size 256 next 162
2022-06-01 14:47:24.024032: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884400 of size 256 next 163
2022-06-01 14:47:24.024035: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884500 of size 256 next 164
2022-06-01 14:47:24.024039: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884600 of size 256 next 165
2022-06-01 14:47:24.024043: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884700 of size 256 next 166
2022-06-01 14:47:24.024047: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884800 of size 256 next 167
2022-06-01 14:47:24.024051: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884900 of size 256 next 168
2022-06-01 14:47:24.024054: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884a00 of size 256 next 169
2022-06-01 14:47:24.024058: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884b00 of size 256 next 170
2022-06-01 14:47:24.024062: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884c00 of size 256 next 171
2022-06-01 14:47:24.024066: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83884d00 of size 256 next 172
2022-06-01 14:47:24.024069: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] Free  at 7f1f83884e00 of size 327680 next 232
2022-06-01 14:47:24.024073: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f838d4e00 of size 256 next 206
2022-06-01 14:47:24.024077: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] Free  at 7f1f838d4f00 of size 2556416 next 178
2022-06-01 14:47:24.024081: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] InUse at 7f1f83b45100 of size 256 next 179
2022-06-01 14:47:24.024085: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] Free  at 7f1f83b45200 of size 189836800 next 18446744073709551615
2022-06-01 14:47:24.024088: I tensorflow/core/common_runtime/bfc_allocator.cc:1071]      Summary of in-use Chunks by size: 
2022-06-01 14:47:24.024094: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] 50 Chunks of size 256 totalling 12.5KiB
2022-06-01 14:47:24.024099: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] 87 Chunks of size 512 totalling 43.5KiB
2022-06-01 14:47:24.024104: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] 1 Chunks of size 768 totalling 768B
2022-06-01 14:47:24.024108: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] 1 Chunks of size 1280 totalling 1.2KiB
2022-06-01 14:47:24.024113: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] 3 Chunks of size 2560 totalling 7.5KiB
2022-06-01 14:47:24.024117: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] 5 Chunks of size 4096 totalling 20.0KiB
2022-06-01 14:47:24.024121: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] 1 Chunks of size 4608 totalling 4.5KiB
2022-06-01 14:47:24.024126: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] 16 Chunks of size 65536 totalling 1.00MiB
2022-06-01 14:47:24.024130: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] 1 Chunks of size 127232 totalling 124.2KiB
2022-06-01 14:47:24.024135: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] 1 Chunks of size 128768 totalling 125.8KiB
2022-06-01 14:47:24.024139: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] 6 Chunks of size 327680 totalling 1.88MiB
2022-06-01 14:47:24.024144: I tensorflow/core/common_runtime/bfc_allocator.cc:1074] 2 Chunks of size 1806336000 totalling 3.36GiB
2022-06-01 14:47:24.024148: I tensorflow/core/common_runtime/bfc_allocator.cc:1078] Sum Total of in-use chunks: 3.37GiB
2022-06-01 14:47:24.024152: I tensorflow/core/common_runtime/bfc_allocator.cc:1080] total_region_allocated_bytes_: 3808755712 memory_limit_: 3808755712 available bytes: 0 curr_region_allocation_bytes_: 7617511424
2022-06-01 14:47:24.024159: I tensorflow/core/common_runtime/bfc_allocator.cc:1086] Stats: 
Limit:                      3808755712
InUse:                      3616034816
MaxInUse:                   3626576640
NumAllocs:                      368395
MaxAllocSize:               1806336000
Reserved:                            0
PeakReserved:                        0
LargestFreeBlock:                    0

2022-06-01 14:47:24.024167: W tensorflow/core/common_runtime/bfc_allocator.cc:474] ************************************************************************************************____
Traceback (most recent call last):
  File "/home/alxhoff/git/GitHub/tiny/benchmark/training/anomaly_detection/00_train.py", line 208, in <module>
    history = model.fit(train_data[:len(train_data)],
  File "/home/alxhoff/.local/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/alxhoff/.local/lib/python3.8/site-packages/tensorflow/python/framework/constant_op.py", line 106, in convert_to_eager_tensor
    return ops.EagerTensor(value, ctx.device_name, dtype)
tensorflow.python.framework.errors_impl.InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run _EagerConst: Dst tensor is not ini

I should mention that, although the error message suggests it, setting TF_GPU_ALLOCATOR=cuda_malloc_async did not solve the issue.
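
A rough reading of the numbers in the log above, as a sanity check rather than a confirmed diagnosis. The byte counts are copied from the allocator dump. The row width of 640 float32 features and the batch size of 512 are assumptions (they are not stated in this issue), but they happen to divide the logged sizes exactly and reproduce the 1379 steps in the progress bar. Under those assumptions, the two 1806336000-byte chunks each hold about 705600 rows, and the failing 200704000-byte request holds 78400 rows, which would be a 10% validation slice that no longer fits on the GPU.

```python
import math

# Sizes copied from the allocator log in this issue.
BIG_CHUNK_BYTES = 1_806_336_000      # each of the two large in-use chunks
FAILED_REQUEST_BYTES = 200_704_000   # the request that ran out of memory
GPU_LIMIT_BYTES = 3_808_755_712      # "Limit" in the allocator stats
GPU_IN_USE_BYTES = 3_616_034_816     # "InUse" in the allocator stats

# Assumptions, not stated in the issue: 640 float32 features per row,
# and a batch size of 512.
FEATURES_PER_ROW = 640
BYTES_PER_FLOAT32 = 4
BATCH_SIZE = 512
ROW_BYTES = FEATURES_PER_ROW * BYTES_PER_FLOAT32

train_rows = BIG_CHUNK_BYTES // ROW_BYTES        # rows in one large chunk
val_rows = FAILED_REQUEST_BYTES // ROW_BYTES     # rows in the failed request
steps_per_epoch = math.ceil(train_rows / BATCH_SIZE)
free_bytes = GPU_LIMIT_BYTES - GPU_IN_USE_BYTES  # what is left on the GPU

print("rows per large chunk:", train_rows)
print("rows in failed request:", val_rows)
print("steps per epoch at batch 512:", steps_per_epoch)
print("free bytes on GPU:", free_bytes)
print("request fits:", FAILED_REQUEST_BYTES <= free_bytes)
```

If this reading is right, the batch size would not matter, because the whole array is copied to the GPU as one constant (the `_EagerConst` op in the log) before batching happens. That would also be consistent with the failure appearing only at the end of the first epoch.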

Cheers,

Alex
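
One possible workaround, sketched here under stated assumptions and not confirmed anywhere in this thread, is to feed the model one batch at a time so that the full array is never copied to the GPU in a single allocation. The helpers below (`split_sizes`, `batch_ranges`) are hypothetical names, and the row count, validation percentage, and batch size are illustrative values. The sketch only computes index ranges in plain Python; in Keras such ranges could back a `tf.keras.utils.Sequence` or a `tf.data.Dataset.from_generator` pipeline.

```python
def split_sizes(n_rows, val_percent):
    """Return (n_train, n_val), taking the validation rows from the end.

    Integer arithmetic avoids floating-point rounding in the split.
    """
    n_val = n_rows * val_percent // 100
    return n_rows - n_val, n_val


def batch_ranges(start, stop, batch_size):
    """Yield (lo, hi) index pairs covering [start, stop) in batch-sized steps."""
    for lo in range(start, stop, batch_size):
        yield lo, min(lo + batch_size, stop)


# Illustrative values only (assumptions, not taken from the benchmark config).
n_rows = 784_000
batch_size = 512
n_train, n_val = split_sizes(n_rows, val_percent=10)

train_batches = list(batch_ranges(0, n_train, batch_size))
val_batches = list(batch_ranges(n_train, n_rows, batch_size))

# Each (lo, hi) pair selects train_data[lo:hi], so only one batch-sized
# slice needs to be converted to a tensor at any moment.
print(len(train_batches), "training batches, last:", train_batches[-1])
print(len(val_batches), "validation batches, last:", val_batches[-1])
```

Note that Keras does not support `validation_split` when the input is a generator, `Sequence`, or `Dataset`, so with this approach the validation slice has to be passed explicitly through `validation_data`.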