AIWintermuteAI / aXeleRate

Keras-based framework for AI on the Edge
MIT License

After installation of aXeleRate, test_training_inference.py are freezing #33

Closed orossant closed 3 years ago

orossant commented 3 years ago

Describe the bug: I followed your very helpful and clear procedure explained here: https://www.instructables.com/Object-Detection-With-Sipeed-MaiX-BoardsKendryte-K/

After installing conda, activating an environment, and installing aXeleRate on my Mac, I ran some tests to check that everything works. I tried several variants:

    python ./tests_training_inference.py
    python ./tests_training_inference.py -t classifier -a 'Tini Yolo'
    python ./tests_training_inference.py -t classifier -a 'Full Yolo'

The result is always the same: Epoch 1/5 completes, but Epoch 2/5 freezes at step 1/5.

To Reproduce: follow the procedure at https://www.instructables.com/Object-Detection-With-Sipeed-MaiX-BoardsKendryte-K/

Expected behavior: Epochs 1 through 5 run without any errors or freezes.

Screenshots:

    Epoch 1/5
    5/5 [==============================] - 8s 1s/step - loss: 1.6890 - accuracy: 0.2889 - val_loss: 1.6095 - val_accuracy: 0.2000

    Epoch 00001: val_accuracy improved from -inf to 0.20000, saving model to projects/classifier/2021-01-08_15-34-56/Classifier_best_val_accuracy.h5
    Epoch 00000: Learning rate is 2.6666666666666667e-05.

    Epoch 2/5
    1/5 [=====>........................] - ETA: 6s - loss: 1.4810 - accuracy: 0.5000

Environment: local machine running macOS Catalina, with Miniconda 3 installed and one dedicated environment activated:

    conda create -n yolo python=3.7
    conda activate yolo
    pip install git+https://github.com/AIWintermuteAI/aXeleRate

Then, inside the aXeleRate folder:

    python ./tests_training_inference.py

Additional context: I have found that many people have similar problems, but in different contexts: https://www.google.com/search?client=firefox-b-e&q=keras+freeze+during+training
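For a freeze like the one described above, one generic way to see where the Python process is stuck is the standard library's `faulthandler` module. The sketch below is not part of aXeleRate; the helper name `hang_watchdog` and the 60-second default are assumptions made for illustration. It prints the stack of every thread at a fixed interval while the wrapped code is still running.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def hang_watchdog(timeout_s=60.0, file=sys.stderr):
    """Dump all thread tracebacks every `timeout_s` seconds while the block runs.

    Hypothetical helper for illustration only; it is not an aXeleRate API.
    `file` must be a real file object (it needs a file descriptor).
    """
    # Arm a background timer that writes the tracebacks to `file`.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=file)
    try:
        yield
    finally:
        # Disarm the timer once the block finishes (or raises).
        faulthandler.cancel_dump_traceback_later()
```

Wrapping the training call in `with hang_watchdog(60):` would then show, once the progress bar stops moving, which call each thread is blocked in. That output is usually the most useful thing to attach to a report like this one.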

orossant commented 3 years ago

Additional info: here are all the package versions inside my environment.

conda list

    # packages in environment at /Users/orossant/miniconda3/envs/yolo2:
    #
    # Name                  Version       Build           Channel
    absl-py                 0.11.0        pypi_0          pypi
    astunparse              1.6.3         pypi_0          pypi
    axelerate               0.7.0         pypi_0          pypi
    ca-certificates         2020.12.8     hecd8cb5_0
    cachetools              4.2.0         pypi_0          pypi
    certifi                 2020.12.5     py37hecd8cb5_0
    chardet                 4.0.0         pypi_0          pypi
    cycler                  0.10.0        pypi_0          pypi
    decorator               4.4.2         pypi_0          pypi
    defusedxml              0.6.0         pypi_0          pypi
    flatbuffers             1.12          pypi_0          pypi
    gast                    0.3.3         pypi_0          pypi
    google-auth             1.24.0        pypi_0          pypi
    google-auth-oauthlib    0.4.2         pypi_0          pypi
    google-pasta            0.2.0         pypi_0          pypi
    grpcio                  1.32.0        pypi_0          pypi
    h5py                    2.10.0        pypi_0          pypi
    idna                    2.10          pypi_0          pypi
    imageio                 2.9.0         pypi_0          pypi
    imgaug                  0.4.0         pypi_0          pypi
    importlib-metadata      3.3.0         pypi_0          pypi
    jinja2                  2.11.2        pypi_0          pypi
    joblib                  1.0.0         pypi_0          pypi
    keras-preprocessing     1.1.2         pypi_0          pypi
    kiwisolver              1.3.1         pypi_0          pypi
    libcxx                  10.0.0        1
    libedit                 3.1.20191231  h1de35cc_1
    libffi                  3.3           hb1e8313_2
    markdown                3.3.3         pypi_0          pypi
    markupsafe              1.1.1         pypi_0          pypi
    matplotlib              3.3.3         pypi_0          pypi
    ncurses                 6.2           h0a44026_1
    networkx                2.5           pypi_0          pypi
    numpy                   1.19.5        pypi_0          pypi
    oauthlib                3.1.0         pypi_0          pypi
    onnx                    1.8.0         pypi_0          pypi
    opencv-python           4.1.2.30      pypi_0          pypi
    openssl                 1.1.1i        h9ed2024_0
    opt-einsum              3.3.0         pypi_0          pypi
    pascal-voc-writer       0.1.4         pypi_0          pypi
    pillow                  8.1.0         pypi_0          pypi
    pip                     20.3.3        py37hecd8cb5_0
    protobuf                3.14.0        pypi_0          pypi
    pyasn1                  0.4.8         pypi_0          pypi
    pyasn1-modules          0.2.8         pypi_0          pypi
    pyparsing               2.4.7         pypi_0          pypi
    python                  3.7.9         h26836e1_0
    python-dateutil         2.8.1         pypi_0          pypi
    pywavelets              1.1.1         pypi_0          pypi
    readline                8.0           h1de35cc_0
    requests                2.25.1        pypi_0          pypi
    requests-oauthlib       1.3.0         pypi_0          pypi
    rsa                     4.6           pypi_0          pypi
    scikit-image            0.18.1        pypi_0          pypi
    scikit-learn            0.24.0        pypi_0          pypi
    scipy                   1.6.0         pypi_0          pypi
    setuptools              51.0.0        py37hecd8cb5_2
    shapely                 1.7.1         pypi_0          pypi
    six                     1.15.0        pypi_0          pypi
    sklearn                 0.0           pypi_0          pypi
    sqlite                  3.33.0        hffcf06c_0
    tensorboard             2.4.0         pypi_0          pypi
    tensorboard-plugin-wit  1.7.0         pypi_0          pypi
    tensorflow              2.4.0         pypi_0          pypi
    tensorflow-estimator    2.4.0         pypi_0          pypi
    termcolor               1.1.0         pypi_0          pypi
    tf2onnx                 1.7.2         pypi_0          pypi
    threadpoolctl           2.1.0         pypi_0          pypi
    tifffile                2020.12.8     pypi_0          pypi
    tk                      8.6.10        hb0a8c7a_0
    tqdm                    4.55.1        pypi_0          pypi
    typing-extensions       3.7.4.3       pypi_0          pypi
    urllib3                 1.26.2        pypi_0          pypi
    werkzeug                1.0.1         pypi_0          pypi
    wheel                   0.36.2        pyhd3eb1b0_0
    wrapt                   1.12.1        pypi_0          pypi
    xz                      5.2.5         h1de35cc_0
    zipp                    3.4.0         pypi_0          pypi
    zlib                    1.2.11        h1de35cc_3

AIWintermuteAI commented 3 years ago

Hello! Well, the main issue I'm seeing here is that you're using macOS. aXeleRate is meant to be run on (and is only tested on) Linux (Ubuntu 18.04) and Google Colab. While training should theoretically work on both Windows and macOS, aXeleRate is primarily meant as a framework for training AND conversion of models to be run on embedded devices. The conversion step uses various converters; some of them (such as the Google Edge TPU model converter) do not run anywhere except Linux, and others (such as nncase) are unstable and buggy on Windows and macOS. I need to add this to the README :)
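Until the README says so, the platform restriction described above could be checked from a script before training starts. The following is only a minimal sketch using the standard `platform` module; the names `SUPPORTED_SYSTEMS` and `platform_warning` are hypothetical and are not part of aXeleRate.

```python
import platform

# Assumption for this sketch: "Linux" covers both Ubuntu 18.04 and Google Colab,
# the two environments named as tested in this thread.
SUPPORTED_SYSTEMS = {"Linux"}


def platform_warning(system=None):
    """Return a warning string on untested platforms, or None on Linux.

    `system` defaults to platform.system(), which returns values such as
    "Linux", "Darwin" (macOS) or "Windows".
    """
    system = system or platform.system()
    if system in SUPPORTED_SYSTEMS:
        return None
    return (
        f"Detected {system}: aXeleRate is only tested on Linux (Ubuntu 18.04) "
        "and Google Colab; training or model conversion may misbehave here."
    )


if __name__ == "__main__":
    message = platform_warning()
    print(message or "Platform looks supported.")
```

On the reporter's Mac, `platform.system()` returns "Darwin", so the script would print the warning rather than fail silently later in training or conversion.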

Meanwhile, you can try running training in Google Colab, where you can use GPUs. Alternatively, if you want to run aXeleRate locally on a Mac, you can install and run it in a virtual machine.