Closed: norbertsk9 closed this issue 3 years ago.
Hi, I have the same error. Some time ago, I used the script from here: https://github.com/curiousily/Deep-Learning-For-Hackers/blob/master/15.object-detection.ipynb

Now, when I run it, the same error as yours appears. I can see that most of the packages have been upgraded. I am trying to downgrade some components, but so far no success.
I have the same problem.
I am having the same problem. Has anyone figured out what might be causing this?
No, the problem still exists. I am using tensorflow==2.3.0 and keras==2.4.3.
I am using Keras 2.4.3 and TensorFlow 2.3.0 as well.

But I have noticed that I have a Paperspace GPU server that runs this code just fine. It is running TensorFlow 1.14.0 and Keras 2.3.1.

I have tried downgrading my tensorflow to 1.14.0 and keras to 2.3.1, but I get a different set of errors then. I will post what they are here in a few minutes, once I recreate them again lol.
So I just did the following:

```
pip uninstall keras-resnet
pip uninstall keras-retinanet
pip uninstall Keras-Preprocessing
pip uninstall Keras-Applications
pip uninstall tensorflow
pip uninstall tensorflow-gpu
```

Then:

```
pip install tensorflow==1.14.0
pip install tensorflow-gpu==1.14.0
```

Then I reran this setup code:

```
pip install numpy --user
pip install . --user
python setup.py build_ext --inplace
```
And reran my model. I got an error saying keras-retinanet requires at least TensorFlow 2.2, which shocks me since I have it running on a Paperspace GPU server with TensorFlow 1.14.0.

But anyway, I then did pip uninstall tensorflow and pip uninstall tensorflow-gpu, followed by pip install tensorflow==2.2 and pip install tensorflow-gpu==2.2.

I then tried to run the model again and got this new error: "UnboundLocalError: local variable 'retval_' referenced before assignment".

After that, I uninstalled tensorflow and tensorflow-gpu again, installed tensorflow 2.3.0 again, and am still getting the error:
"WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs
batches (in this case, 5000 batches). You may need to use the repeat() function when building your dataset"
So I am kind of at a loss. Just not sure what to try next :|
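For reference, the repeat() function mentioned in that warning applies when feeding a tf.data pipeline: it makes the input loop instead of running dry after one pass, and steps_per_epoch then decides where each epoch ends. A minimal runnable sketch (the data and model here are toy placeholders, not from this thread):

```python
import numpy as np
import tensorflow as tf

# Toy stand-ins; substitute your real arrays and model.
x_train = np.random.rand(100, 8).astype("float32")
y_train = np.random.randint(0, 2, size=(100,)).astype("float32")
batch_size = 32

# repeat() makes the pipeline loop forever, so the input never runs out;
# steps_per_epoch below then bounds each epoch.
dataset = (tf.data.Dataset.from_tensor_slices((x_train, y_train))
           .shuffle(100)
           .batch(batch_size)
           .repeat())

model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(dataset, steps_per_epoch=len(x_train) // batch_size, epochs=2)
```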
I also tried downgrading tensorflow and keras, but it had no effect.
So I just tried creating a new conda environment, using the pip list from my Paperspace GPU server.

Here is the pip list from that server (Python version 3.7.5):

```
Package Version
absl-py 0.7.1
apturl 0.5.2
asn1crypto 0.24.0
astor 0.8.0
attrs 19.1.0
Automat 0.6.0
backcall 0.1.0
bleach 3.1.0
blinker 1.4
Brlapi 0.6.6
certifi 2018.1.18
chardet 3.0.4
click 6.7
cloud-init 19.1
colorama 0.3.7
command-not-found 0.3
configobj 5.0.6
constantly 15.1.0
cryptography 2.1.4
cupshelpers 1.0
cycler 0.10.0
Cython 0.29.21
decorator 4.4.0
defer 1.0.6
defusedxml 0.6.0
distro-info 0.18ubuntu0.18.04.1
entrypoints 0.3
gast 0.2.2
google-pasta 0.1.7
grpcio 1.22.0
h5py 2.9.0
html5lib 0.999999999
httplib2 0.9.2
hyperlink 17.3.1
idna 2.6
incremental 16.10.1
ipykernel 5.1.1
ipython 7.6.1
ipython-genutils 0.2.0
ipywidgets 7.5.0
jedi 0.14.1
Jinja2 2.10.1
joblib 0.13.2
jsonpatch 1.16
jsonpointer 1.10
jsonschema 3.0.1
jupyter 1.0.0
jupyter-client 5.3.1
jupyter-console 6.0.0
jupyter-core 4.5.0
Keras 2.3.1
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.0
keras-resnet 0.1.0
keras-retinanet 0.5.1
keyring 10.6.0
keyrings.alt 3.0
kiwisolver 1.1.0
language-selector 0.1
launchpadlib 1.10.6
lazr.restfulclient 0.13.5
lazr.uri 1.0.3
linecache2 1.0.0
louis 3.5.0
lxml 4.5.2
macaroonbakery 1.1.3
Mako 1.0.7
Markdown 3.1.1
MarkupSafe 1.1.1
matplotlib 3.1.1
mistune 0.8.4
nbconvert 5.5.0
nbformat 4.4.0
netifaces 0.10.4
notebook 6.0.0
numpy 1.16.4
oauth 1.0.1
oauthlib 2.0.6
olefile 0.45.1
opencv-python 4.1.0.25
PAM 0.4.2
pandas 0.25.0
pandocfilters 1.4.2
parso 0.5.1
pbr 3.1.1
pexpect 4.7.0
pickleshare 0.7.5
Pillow 6.1.0
pip 20.2
progressbar 2.5
progressbar2 3.51.4
prometheus-client 0.7.1
prompt-toolkit 2.0.9
protobuf 3.9.0
ptyprocess 0.6.0
pyasn1 0.4.2
pyasn1-modules 0.2.1
pycairo 1.16.2
pycrypto 2.6.1
pycups 1.9.73
Pygments 2.4.2
pygobject 3.26.1
PyJWT 1.5.3
pymacaroons 0.13.0
PyNaCl 1.1.2
pyOpenSSL 17.5.0
pyparsing 2.4.1.1
PyQt5 5.10.1
pyRFC3339 1.0
pyrsistent 0.15.3
pyserial 3.4
python-apt 1.6.5+ubuntu0.2
python-dateutil 2.8.0
python-debian 0.1.32
python-utils 2.4.0
pytz 2019.1
pyxdg 0.25
PyYAML 5.1.1
pyzmq 18.0.2
qtconsole 4.5.2
reportlab 3.4.0
requests 2.18.4
requests-unixsocket 0.1.5
scikit-learn 0.21.2
scipy 1.3.0
screen-resolution-extra 0.0.0
SecretStorage 2.3.1
Send2Trash 1.5.0
service-identity 16.0.0
setuptools 41.0.1
simplegeneric 0.8.1
simplejson 3.13.2
sip 4.19.8
six 1.12.0
ssh-import-id 5.7
system-service 0.3
systemd-python 234
tensorboard 1.14.0
tensorflow 1.14.0
tensorflow-estimator 1.14.0
tensorflow-gpu 1.14.0
termcolor 1.1.0
terminado 0.8.2
testpath 0.4.2
testresources 2.0.0
Theano 1.0.4
torch 1.1.0
torchvision 0.3.0
tornado 6.0.3
traceback2 1.4.0
traitlets 4.3.2
Twisted 17.9.0
ubuntu-drivers-common 0.0.0
ufw 0.36
unattended-upgrades 0.1
unittest2 1.1.0
urllib3 1.22
usb-creator 0.3.3
virtualenv 15.1.0
wadllib 1.3.2
wcwidth 0.1.7
webencodings 0.5.1
Werkzeug 0.15.5
wheel 0.33.4
widgetsnbextension 3.5.0
wrapt 1.11.2
xkit 0.0.0
zope.interface 4.3.2
```
I created a new conda environment, installed numpy 1.16.4, then installed tensorflow 1.14.0 and tensorflow-gpu 1.14.0, and I think I also had to do a pip install keras-retinanet.

After that, I reran my code and it no longer gave me the error saying I need to figure out how to make the dataset repeat.

But now tensorflow is not utilizing my GPUs :( so it kind of defeats the purpose lol
Has anyone solved this issue?
I found two workarounds.

1. Use the --steps argument when training. Steps should be less than or equal to (length of your dataset / batch size), as sketched below. For example:
   - your dataset has 1000 images and batch size is 1: --steps 1000
   - your dataset has 1000 images and batch size is 2: --steps 500
2. Change the default value of steps to None and do not use the --steps argument when training (see https://github.com/fizyr/keras-retinanet/blob/8536cab6baafa8ae3beaa4f62e01cbad872e9884/keras_retinanet/bin/train.py#L436). TensorFlow (Keras) will then calculate the proper number of steps automatically.
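A minimal sketch of the arithmetic for the first option (both numbers are placeholders for your own dataset):

```python
# --steps must not exceed dataset_size / batch_size, or the generator
# runs out of data before the epoch ends.
dataset_size = 1000   # unique training images
batch_size = 2

print(f"--steps {dataset_size // batch_size}")   # -> --steps 500
```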
The solution proposed by @hansoli68 works, but note that for the first version, --steps must be equal to the total number of unique images in your training set (divided by batch size). I was making the mistake of using the total number of training labels, but some images have more than one training label, and my run failed until I determined the number of unique images (a counting sketch follows below).

If the second approach of setting steps=None works, that seems more foolproof. In fact, as currently constructed, it apparently doesn't even make sense to have steps be a mutable parameter?
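One way to get that unique-image count, assuming the standard keras-retinanet CSV annotation format (one row per bounding box, so an image with several labels appears on several rows); the file name and batch size here are placeholders:

```python
import csv

# Each row is path,x1,y1,x2,y2,class_name; the same image path can
# appear on many rows, so collect the paths into a set.
with open("annotations.csv", newline="") as f:
    unique_images = {row[0] for row in csv.reader(f) if row}

batch_size = 1
print(f"{len(unique_images)} unique images -> --steps {len(unique_images) // batch_size}")
```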
If you are using the ImageDataGenerator class, try changing the batch_size argument inside the flow_from_directory function, like this.

Instantiating and setting up the ImageDataGenerators:

```python
from keras.preprocessing.image import ImageDataGenerator

training_generator = ImageDataGenerator(rescale=1./255, rotation_range=7,
                                        horizontal_flip=True, shear_range=0.2,
                                        height_shift_range=0.07, zoom_range=0.2)
test_generator = ImageDataGenerator(rescale=1./255)
```

Setting up the training and test databases: here, inside flow_from_directory, set batch_size to 1 if you want to use all files in your training and test databases:

```python
training_base = training_generator.flow_from_directory('path_to_directory', target_size=(100, 100), batch_size=1, class_mode='binary')
test_base = test_generator.flow_from_directory('path_to_directory', target_size=(100, 100), batch_size=1, class_mode='binary')
```

After that, set steps_per_epoch to the total number of files in your training database divided by the batch_size you set, in this case 1. Do the same thing for validation_steps, but use the total number of files in your test database instead:

```python
# Floor division keeps the step counts integers.
classifier.fit_generator(training_base, steps_per_epoch=5216 // 1, epochs=5,
                         validation_data=test_base, validation_steps=624 // 1)
```

Hope it helps you. Sorry for my English.
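A slightly more general version of that last call (a sketch building on the snippet above): flow_from_directory iterators expose .samples and .batch_size, so the step counts do not need to be hard-coded:

```python
# Derive the step counts from the iterators instead of hard-coding
# 5216 and 624.
steps_per_epoch = training_base.samples // training_base.batch_size
validation_steps = test_base.samples // test_base.batch_size

classifier.fit_generator(training_base, steps_per_epoch=steps_per_epoch,
                         epochs=5, validation_data=test_base,
                         validation_steps=validation_steps)
```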
I have done as @Andre-Vitorino suggested. I no longer get that error, but it looks like training only works because in each epoch the network sees the same original images: the validation accuracy doesn't change regardless of what learning rate is set. So it makes training run, but it doesn't solve the underlying problem, namely that the images are not augmented.
I also used @Andre-Vitorino's approach and it solved the error; however, the accuracy shows the same value.
Hello, during training with train.py on Google Colab, this error occurred:

```
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure
that your dataset or generator can generate at least `steps_per_epoch * epochs`
batches (in this case, 5000 batches). You may need to use the repeat() function
when building your dataset.
```

Thanks for your help.
It worked for me when I changed the number of steps per epoch. It was showing me the error after step 2429; as my dataset has 2430 images overall, I changed steps_per_epoch = 2429 and it started running without any error.
Thank you
This issue has been automatically marked as stale due to the lack of recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
The code I run:

```python
new_model.fit_generator(train_generator, validation_data=(x_valid, y_valid), steps_per_epoch=len(x_train), epochs=2)
```

The error I got:

```
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure
that your dataset or generator can generate at least `steps_per_epoch * epochs`
batches (in this case, 16928 batches). You may need to use the repeat() function
when building your dataset.
```

Anyone know how to solve it?
Try reducing the steps_per_epoch value below the value you currently have set; this helped me solve the problem.
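For the snippet above, the likely fix is to pass batches per epoch rather than the number of samples (a sketch; the batch size value is an assumption, since it wasn't posted):

```python
batch_size = 32  # assumption: whatever batch size train_generator actually yields

# len(x_train) asks for one batch per sample; dividing by the batch size
# keeps the request within what the generator can supply per epoch.
new_model.fit_generator(train_generator,
                        validation_data=(x_valid, y_valid),
                        steps_per_epoch=len(x_train) // batch_size,
                        epochs=2)
```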
I am having the same issue, but curiously, only with my validation generator. It does produce 40 batches in this case, as I could verify by producing data in a for loop, but when it is used by the fit method, Keras says the generator should be able to produce 40 batches but doesn't, and validation is skipped. Stranger still, setting validation_steps to 39 does not help, but 38 does. I use the very same generator with my training data and it works fine.
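If the generator is a keras.utils.Sequence or a flow_from_directory iterator (an assumption; the comment above doesn't say), one way to sidestep such off-by-one guesses is to let len() report the batch count. A sketch, where validation_generator, train_generator, and model stand for the objects described above:

```python
# Sequence-style iterators report batches-per-pass via len(), so this
# value is exactly what Keras can consume per validation pass.
print(len(validation_generator))   # batches in one full pass

model.fit(train_generator,
          validation_data=validation_generator,
          validation_steps=len(validation_generator))
```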
@Andre-Vitorino's solution works for me. Thank you.
Hi, I was also facing the same issue, but I got it resolved by making only a small modification in the code: I passed the parameter steps_per_epoch as an integer instead of a floating-point number, and the issue was resolved.
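A minimal sketch of that point (the counts are placeholders): in Python 3, plain division always yields a float, so use floor division when computing the step count:

```python
num_train_images = 5216   # placeholder
batch_size = 32           # placeholder

# Plain division yields a float, which per the comment above can trigger
# the warning; floor division keeps steps_per_epoch an integer.
steps_per_epoch = num_train_images / batch_size    # float, e.g. 163.0
steps_per_epoch = num_train_images // batch_size   # int, e.g. 163
```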
UserWarning: Your input ran out of data; interrupting training. https://github.com/okbabent/-fev22-ocular-disease/issues/1
Hi, can you help me solve this? I get the same "input ran out of data" error as above, but it always occurs on even-numbered epochs.