Open · AlexCHEU opened this issue 2 years ago
Looks like a memory OOM issue. You can try reducing `shuffle_mb` and `prefetch_mb` in `dataset.py`.
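For reference, a minimal sketch of what that suggestion might look like, assuming the StyleGAN2-style setup where the training script builds a `dataset_args` dict that is forwarded to the dataset loader. The dict keys mirror the parameter names in the reply above, but the exact values and the `tfrecord_dir` path are placeholders; verify both against your copy of `dataset.py`:

```python
# Hedged sketch: shrink the shuffle/prefetch buffers (values are in MB)
# before they are forwarded to the dataset loader.
dataset_args = dict(
    tfrecord_dir='datasets/train',  # hypothetical dataset location
    shuffle_mb=256,   # smaller shuffle buffer; 0 disables shuffling entirely
    prefetch_mb=128,  # smaller prefetch buffer to reduce host RAM use
)
```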
Thank you so much for your prompt reply! :raised_hands:
I have fixed the TFRecord file and set `shuffle_mb=0`, `prefetch_mb=0`, `sched.minibatch_size_base=4`, and `sched.minibatch_gpu_base=2`, but it is still giving errors:
```
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
  (0) Out of range: End of sequence
     [[{{node GPU0/DataFetch/IteratorGetNext}}]]
  (1) Out of range: End of sequence
     [[{{node GPU0/DataFetch/IteratorGetNext}}]]
     [[GPU0/DataFetch/IteratorGetNext/_2837]]
0 successful operations.
0 derived errors ignored.
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
  (0) Out of range: End of sequence
     [[node GPU0/DataFetch/IteratorGetNext (defined at /tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py:1748) ]]
  (1) Out of range: End of sequence
     [[node GPU0/DataFetch/IteratorGetNext (defined at /tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py:1748) ]]
     [[GPU0/DataFetch/IteratorGetNext/_2837]]
0 successful operations.
0 derived errors ignored.
```
So confused... there are just 2.5k images :open_mouth:
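One way to sanity-check the data: `Out of range: End of sequence` means the iterator ran out of records, so it is worth confirming the `.tfrecords` file actually contains the expected number of images before blaming the training code. Below is a generic pure-Python record counter, not part of the repo; it only assumes the standard TFRecord on-disk framing (u64 payload length, u32 length-CRC, payload, u32 payload-CRC):

```python
import struct

def count_tfrecords(path):
    """Count records in a TFRecord file by walking its framing:
    each record is <u64 length><u32 len-crc><payload><u32 data-crc>.
    CRCs are skipped, not verified."""
    count = 0
    with open(path, 'rb') as f:
        while True:
            header = f.read(12)          # 8-byte length + 4-byte length CRC
            if len(header) < 12:         # clean EOF (or truncated header)
                break
            (length,) = struct.unpack('<Q', header[:8])
            f.seek(length + 4, 1)        # skip payload + 4-byte data CRC
            count += 1
    return count
```

If the count does not match your 2.5k images, the conversion step likely failed partway through.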
Hello, may I ask if you have solved this problem?
I have the same problem. Have you solved it? @AlexCHEU
Hello, thanks for your wonderful work!
I have a question about running `run_training.py` with a custom dataset. I ran `create_from_images.py` and finally got one TFRecord file. I was wondering if one single file is OK? (2.07 GB containing 2k images)

```
python run_training.py --data-dir=<> --result-dir=<> --dataset="train" --num-gpus=1 --total-kimg=10000 --mirror-augment=True
```
Is it related to the TensorFlow version? I am trying to run the training session on Google Colab, which only provides TensorFlow 1.15.2 now...
Could you please help me figure out what I did wrong?
Thanks for any of your help and happy CNY :)
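On the single-file question above: in the original StyleGAN/StyleGAN2 tooling, the image-conversion step usually writes one `.tfrecords` file per power-of-two resolution, so a dataset directory containing only one file can itself explain an early `End of sequence`. A small, hedged sketch for listing what is actually in the dataset directory (the multi-file naming pattern is an assumption; compare against your own conversion output):

```python
import glob
import os

def list_tfrecords(tfrecord_dir):
    """Return (filename, size-in-bytes) for every .tfrecords file in a
    directory. StyleGAN-style tooling is expected (assumption) to have
    produced one file per resolution; a single entry may indicate an
    incomplete conversion."""
    paths = sorted(glob.glob(os.path.join(tfrecord_dir, '*.tfrecords')))
    return [(os.path.basename(p), os.path.getsize(p)) for p in paths]
```

Running it over the directory passed as `--data-dir` and checking for several files of increasing size is a quick first diagnostic.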