zsyzzsoft / co-mod-gan

[ICLR 2021, Spotlight] Large Scale Image Completion via Co-Modulated Generative Adversarial Networks

Question about custom dataset preparing, tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found. #30

Open LigZhong opened 3 years ago

LigZhong commented 3 years ago

Hello, thanks for your great work. I ran into a problem when training on my own dataset. I created the TFRecords from my own dataset (JPG files only) and ran the Python scripts as indicated, but when I run the training code I get the following error:

python3 run_training.py --data-dir ./dataset --dataset custom_3 --num-gpus 1 --metrics=ids36k5 --total-kimg 5000
Local submit - run_dir: results/00022-co-mod-gan-custom_3-1gpu
dnnlib: Running training.training_loop.training_loop() on localhost...
Streaming data using training.dataset.TFRecordDataset...
tfrecord_dir: dataset/custom_3
max_shape: [3, 256, 256]
Dataset shape = [3, 256, 256]
Dynamic range = [0, 255]
Label size = 0

Building TensorFlow graph...
Initializing logs...
Training for 50000 kimg...

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
  (0) Out of range: End of sequence
     [[{{node GPU0/DataFetch/IteratorGetNext}}]]
  (1) Out of range: End of sequence
     [[{{node GPU0/DataFetch/IteratorGetNext}}]]
     [[GPU0/DataFetch/IteratorGetNext/_2837]]
0 successful operations.
0 derived errors ignored.

Can anyone give a hint?

liupgd commented 3 years ago

I also got the 'Out of range: End of sequence' error. Adding a validation directory when preparing the dataset solved it for me. You can try it:

python dataset_tools/create_from_images.py --train-image-dir ./imgs/png_samples/ --val-image-dir ./imgs/png_samples/ --tfrecord-dir ./train_dataset --resolution 512 --num-channels 3
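An 'End of sequence' error from IteratorGetNext means the input iterator had no more records to return, so before retraining it may be worth checking that none of the generated TFRecord files are empty. The following is a minimal sketch using only the Python standard library. It relies on the standard TFRecord framing (8-byte length, 4-byte length CRC, payload, 4-byte payload CRC) and does not verify the CRCs. The `*.tfrecords` glob pattern, the default directory, and the helper names `count_tfrecords` and `summarize` are assumptions for illustration, not part of this repository.

```python
import glob
import os
import struct
import sys


def count_tfrecords(path):
    """Count the records in one TFRecord file by walking its framing.

    Each record is stored as: uint64 payload length (little-endian),
    uint32 CRC of the length, the payload, uint32 CRC of the payload.
    The CRCs are skipped here, not validated.
    """
    size = os.path.getsize(path)
    pos = 0
    count = 0
    with open(path, "rb") as f:
        while pos < size:
            f.seek(pos)
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("truncated record header in %s" % path)
            (length,) = struct.unpack("<Q", header)
            # Advance past length field, length CRC, payload, payload CRC.
            pos += 8 + 4 + length + 4
            if pos > size:
                raise ValueError("truncated record payload in %s" % path)
            count += 1
    return count


def summarize(tfrecord_dir):
    """Map each *.tfrecords file name in a directory to its record count."""
    pattern = os.path.join(tfrecord_dir, "*.tfrecords")  # assumed extension
    return {os.path.basename(p): count_tfrecords(p)
            for p in sorted(glob.glob(pattern))}


if __name__ == "__main__":
    # Hypothetical default; pass your own --tfrecord-dir value instead.
    tfrecord_dir = sys.argv[1] if len(sys.argv) > 1 else "./train_dataset"
    for name, n in summarize(tfrecord_dir).items():
        flag = "  <-- empty" if n == 0 else ""
        print("%s: %d records%s" % (name, n, flag))
```

If any file reports 0 records, or an expected file is missing altogether, re-creating the dataset with both the training and validation image directories set, as in the command above, is a reasonable first thing to try.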

mostafa610 commented 3 years ago

I think you are trying to use more than one GPU when you only have one.

zzz105120 commented 2 years ago

Hello, may I ask if you have solved this problem?