Kenneth-X closed 7 years ago
What is your batch size? Try to reduce it.
The --not-restore-last flag is needed when you restore a model trained on a dataset with X classes while your new dataset has Y classes (where X != Y). If it is passed, the last layers (where the channel dimension is equal to the number of classes) will be re-initialised randomly.
Another use case of this flag might be when you are fine-tuning your trained model on a new portion of data with the same number of classes.
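To make the effect of the flag more concrete, here is a minimal sketch of the idea in plain Python. It is not the repository's actual code: the helper variables_to_restore, the "fc" name tag, and the example variable names are all hypothetical, chosen only to show how the class-dependent layers could be left out of the set loaded from a checkpoint.

```python
def variables_to_restore(all_variable_names, not_restore_last, last_layer_tag="fc"):
    """Return the variable names that should be loaded from the checkpoint.

    Hypothetical helper: `last_layer_tag` is an assumed naming convention
    for the final, class-dependent layers.
    """
    if not not_restore_last:
        # Flag not passed: restore everything, including the last layers.
        return list(all_variable_names)
    # Flag passed: skip the last layers so they keep their random initialisation.
    return [name for name in all_variable_names if last_layer_tag not in name]


# Illustrative variable names only; a real model will use different ones.
names = [
    "conv1/weights",
    "res5c/weights",
    "fc1_voc12_c0/weights",
    "fc1_voc12_c0/biases",
]

restored = variables_to_restore(names, not_restore_last=True)
print(restored)  # ['conv1/weights', 'res5c/weights']
```

With the flag off, every name is returned, which only works if the checkpoint and the new model agree on the number of classes.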
Something goes wrong when I use my own dataset with 18 classes. I have followed all the instructions, but the following problem still occurs. My GPU is a K80; the log is here:
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 1280 totalling 1.2KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 68 Chunks of size 2048 totalling 136.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 151 Chunks of size 4096 totalling 604.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 7936 totalling 7.8KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 32 Chunks of size 8192 totalling 256.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 2 Chunks of size 16384 totalling 32.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 2 Chunks of size 37632 totalling 73.5KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 8 Chunks of size 65536 totalling 512.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 2 Chunks of size 131072 totalling 256.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 4 Chunks of size 147456 totalling 576.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 8 Chunks of size 262144 totalling 2.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 376832 totalling 368.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 4 Chunks of size 524288 totalling 2.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 5 Chunks of size 589824 totalling 2.81MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 57 Chunks of size 1048576 totalling 57.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 5 Chunks of size 1327104 totalling 6.33MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 1556480 totalling 1.48MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 1572864 totalling 1.50MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 4 Chunks of size 2097152 totalling 8.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 29 Chunks of size 2359296 totalling 65.25MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 2805760 totalling 2.68MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 3664128 totalling 3.49MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 6 Chunks of size 4194304 totalling 24.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 5992448 totalling 5.71MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 2 Chunks of size 8388608 totalling 16.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 4 Chunks of size 9437184 totalling 36.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:700] Sum Total of in-use chunks: 237.38MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:702] Stats: Limit: 248905728 InUse: 248905728 MaxInUse: 248905728 NumAllocs: 905 MaxAllocSize: 9437184
W tensorflow/core/common_runtime/bfc_allocator.cc:274] ****
W tensorflow/core/common_runtime/bfc_allocator.cc:275] Ran out of memory trying to allocate 1.27MiB. See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:993] Resource exhausted: OOM when allocating tensor with shape[3,3,2048,18]
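The figures in the log above can be cross-checked with a little arithmetic. The sketch below assumes the failing tensor holds 32-bit floats (4 bytes per element), which the log does not state; the helper name tensor_mib is made up for illustration. Under that assumption, the tensor of shape [3,3,2048,18] accounts for the 1.27MiB request, and the allocator limit of 248905728 bytes works out to roughly 237 MiB. That is far less than the memory on a K80, so it may be worth checking whether something else is holding GPU memory, in addition to reducing the batch size.

```python
BYTES_PER_FLOAT32 = 4   # assumption: the tensor stores 32-bit floats
MIB = 1024 ** 2         # bytes in one mebibyte


def tensor_mib(shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Size of a dense tensor with the given shape, in MiB."""
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * bytes_per_element / MIB


# Tensor from the "Resource exhausted" line: a 3x3 kernel, 2048 inputs, 18 classes.
failed = tensor_mib((3, 3, 2048, 18))
print(f"{failed:.2f} MiB")  # 1.27 MiB

# "Limit" value from the Stats line, converted from bytes to MiB.
limit = 248905728 / MIB
print(f"{limit:.3f} MiB")   # 237.375 MiB
```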
Also, regarding the statement ''If restoring weights from the PASCAL models for your dataset with a different number of classes, you will also need to pass the --not-restore-last flag, which will prevent the last layers of size 21 from being restored'': I couldn't find where ''--not-restore-last'' is actually used in your code. Can you explain it?