OlafenwaMoses / ImageAI

A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities
https://www.genxr.co/#products
MIT License

Problem with idenprof! #661

Open danydumont opened 3 years ago

danydumont commented 3 years ago

This weekend I tried the idenprof model downloaded from your GitHub with the mechanic picture (4.jpg), and the model gave me these results (see below).

Do you know what can be my problem?

Thank you in advance for your help

"python3 FirstCustomImageRecognition.py
2021-05-09 21:44:57.542734: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-05-09 21:44:57.542760: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-05-09 21:44:58.607180: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-05-09 21:44:58.607370: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-05-09 21:44:58.607381: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2021-05-09 21:44:58.607412: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (pop-os): /proc/driver/nvidia/version does not exist
2021-05-09 21:44:58.607584: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-05-09 21:44:58.607980: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-05-09 21:44:59.448109: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-05-09 21:44:59.466172: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 1800000000 Hz
farmer : 100.0
waiter : 0.0
police : 0.0
pilot : 0.0
mechanic : 0.0
"

ekesdf commented 3 years ago

Please post the program you used for the detection and the image created by ImageAI. Also, did you train the model yourself or just run the detection?

amallais commented 3 years ago

I have the same issue.

firefighter : 100.0
waiter : 0.0
police : 0.0
pilot : 0.0
mechanic : 0.0

Dataset I used is the idenprof pictures.

I built the model with the following example script custom_model_training.py

from imageai.Classification.Custom import ClassificationModelTrainer

model_trainer = ClassificationModelTrainer()
model_trainer.setModelTypeAsResNet50()
model_trainer.setDataDirectory("idenprof")
model_trainer.trainModel(num_objects=10, num_experiments=200, enhance_data=True, batch_size=32, show_network_summary=True)

I tried the prediction with the following example script custom_model_prediction.py

from imageai.Classification.Custom import CustomImageClassification
import os

execution_path = os.getcwd()

prediction = CustomImageClassification()
prediction.setModelTypeAsResNet50()
prediction.setModelPath(os.path.join(execution_path, "model.h5")) # Download the model via this link https://github.com/OlafenwaMoses/ImageAI/releases/tag/models-v3
prediction.setJsonPath(os.path.join(execution_path, "model_class.json"))
prediction.loadModel(num_objects=10)

predictions, probabilities = prediction.classifyImage(os.path.join(execution_path, "image.jpg"), result_count=5)

for eachPrediction, eachProbability in zip(predictions, probabilities):
    print(eachPrediction , " : " , eachProbability)

I am using a Python venv with the following pip packages:

absl-py 0.12.0
astunparse 1.6.3
cachetools 4.2.2
certifi 2020.12.5
chardet 4.0.0
cycler 0.10.0
flatbuffers 1.12
gast 0.3.3
google-auth 1.30.0
google-auth-oauthlib 0.4.4
google-pasta 0.2.0
grpcio 1.32.0
h5py 2.10.0
idna 2.10
imageai 2.1.6
Keras 2.4.3
Keras-Preprocessing 1.1.2
keras-resnet 0.2.0
kiwisolver 1.3.1
Markdown 3.3.4
matplotlib 3.3.2
numpy 1.19.3
oauthlib 3.1.0
opencv-python 4.5.2.52
opt-einsum 3.3.0
Pillow 7.0.0
pip 20.1.1
pkg-resources 0.0.0
protobuf 3.17.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pyparsing 2.4.7
python-dateutil 2.8.1
PyYAML 5.4.1
requests 2.25.1
requests-oauthlib 1.3.0
rsa 4.7.2
scipy 1.4.1
setuptools 44.0.0
six 1.15.0
tensorboard 2.5.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.0
tensorflow-estimator 2.4.0
tensorflow-gpu 2.4.0
termcolor 1.1.0
typing-extensions 3.7.4.3
urllib3 1.26.4
Werkzeug 2.0.0
wheel 0.36.2
wrapt 1.12.1

I'm pretty sure I am following the tutorial, but I could be making a mistake somewhere, any ideas ?

ekesdf commented 3 years ago

The problem is that you can't train the model for just 200 epochs and expect it to be perfect. The number of epochs you need to train a model, especially when you have many different classes, is relatively big, like 500 or even a couple of thousand epochs with a really good dataset.

amallais commented 3 years ago

I think I understand what you mean. But I thought it would not give a 100% prediction, but rather something more spread out between the different classes.

So if I let it run for a couple of thousand epochs, it should be good in your opinion?

I just wanted to test the tutorial and I wasn't sure if I was going in a good direction. I went with 200 epochs because the example file had 200 epochs.
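
(Editor's sketch) The expectation above, that an undertrained model would spread its probability across classes, depends on the size of the raw scores fed into the final softmax. The following is a minimal, standard-library illustration of how softmax can print exactly 100.0 / 0.0 once one score is far larger than the rest. The logit values are made up for illustration; nothing here is taken from ImageAI internals, and the thread does not confirm this as the cause.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def as_percent(probs, ndigits=1):
    """Round probabilities to percentages, like the printed results in this thread."""
    return [round(p * 100, ndigits) for p in probs]

# Hypothetical moderate scores: the probability mass is spread over classes.
spread = as_percent(softmax([2.0, 1.5, 1.0, 0.5, 0.0]))

# Hypothetical scores where one value is far larger: the output saturates.
saturated = as_percent(softmax([40.0, 5.0, 3.0, 1.0, 0.0]))

print(spread)
print(saturated)  # [100.0, 0.0, 0.0, 0.0, 0.0]
```

So a result of exactly 100.0 for one class does not by itself say the model is confident for good reasons; it only says one score dominated the others.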

ekesdf commented 3 years ago

Well, try the pre-trained model; it should work just fine. Can you tell me how high your loss is, so I can make a rough estimate of how many more epochs you need to train your custom model to get a good result?

amallais commented 3 years ago

I've downloaded the model and json from here : https://github.com/OlafenwaMoses/ImageAI/releases/tag/essentials-v5

and got the same prediction (100% wrong one)

farmer : 100.0
waiter : 0.0
police : 0.0
pilot : 0.0
mechanic : 0.0

from imageai.Classification.Custom import CustomImageClassification
import os

execution_path = os.getcwd()

prediction = CustomImageClassification()
prediction.setModelTypeAsResNet50()
prediction.setModelPath(os.path.join(execution_path, "idenprof_resnet_ex-056_acc-0.993062.h5")) # Download the model via this link https://github.com/OlafenwaMoses/ImageAI/releases/tag/models-v3
prediction.setJsonPath(os.path.join(execution_path, "idenprof.json"))
prediction.loadModel(num_objects=10)

predictions, probabilities = prediction.classifyImage(os.path.join(execution_path, "image.jpg"), result_count=5)

for eachPrediction, eachProbability in zip(predictions, probabilities):
    print(eachPrediction , " : " , eachProbability)

Here is a sample output of the training process:

281/281 [==============================] - 190s 425ms/step - loss: 2.2515 - accuracy: 0.3288 - val_loss: 2.6181 - val_accuracy: 0.1951

Epoch 00001: accuracy improved from -inf to 0.39507, saving model to idenprof/models/model_ex-001_acc-0.395071.h5
Epoch 2/200
281/281 [==============================] - 48s 170ms/step - loss: 1.4591 - accuracy: 0.4869 - val_loss: 1.7920 - val_accuracy: 0.4017

Epoch 00002: accuracy improved from 0.39507 to 0.50914, saving model to idenprof/models/model_ex-002_acc-0.509144.h5
Epoch 3/200
281/281 [==============================] - 48s 169ms/step - loss: 1.2989 - accuracy: 0.5587 - val_loss: 2.0335 - val_accuracy: 0.4753

Epoch 00003: accuracy improved from 0.50914 to 0.55408, saving model to idenprof/models/model_ex-003_acc-0.554081.h5
Epoch 4/200
281/281 [==============================] - 48s 170ms/step - loss: 1.2052 - accuracy: 0.5902 - val_loss: 2.1017 - val_accuracy: 0.4471

Epoch 00004: accuracy improved from 0.55408 to 0.59110, saving model to idenprof/models/model_ex-004_acc-0.591102.h5
Epoch 5/200
281/281 [==============================] - 48s 169ms/step - loss: 1.1413 - accuracy: 0.6058 - val_loss: 1.7002 - val_accuracy: 0.4965

Epoch 00005: accuracy improved from 0.59110 to 0.60671, saving model to idenprof/models/model_ex-005_acc-0.606713.h5
Epoch 6/200
281/281 [==============================] - 48s 169ms/step - loss: 1.0700 - accuracy: 0.6298 - val_loss: 1.5535 - val_accuracy: 0.5050

Epoch 00006: accuracy improved from 0.60671 to 0.63147, saving model to idenprof/models/model_ex-006_acc-0.631467.h5

ekesdf commented 3 years ago

Well, from the point of view of the training output it looks very good. A loss of 1.x isn't the best, but it should work. The main thing about classification is that even with 99.9 percent accuracy there is still a chance of getting the wrong label for this picture, and with your model's 63 percent accuracy the chance of getting the wrong label is much bigger.

As advice, train the model until your loss is something like 0.7-0.5, or until your accuracy is high enough; something like 90-95 percent will be fine. But don't worry if the training progress starts to slow down at a lower loss; that is normal.
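
(Editor's sketch) The rule of thumb above can be checked against the Keras progress lines posted earlier in the thread. This is a minimal sketch, assuming the log format shown there; the function names and the default thresholds (loss at or below 0.7, accuracy at or above 0.90) are illustrative choices based on the advice, not part of ImageAI.

```python
import re

# Matches "loss: 1.0700", "val_accuracy: 0.5050", and so on in a Keras progress line.
METRIC_RE = re.compile(r"\b(val_loss|val_accuracy|loss|accuracy): (\d+(?:\.\d+)?)")

def parse_epoch_line(line):
    """Extract the metrics from one progress line into a dict of floats."""
    return {name: float(value) for name, value in METRIC_RE.findall(line)}

def reached_target(metrics, max_loss=0.7, min_accuracy=0.90):
    """Apply the rule of thumb: low enough loss OR high enough accuracy."""
    return metrics["loss"] <= max_loss or metrics["accuracy"] >= min_accuracy

# Line for epoch 6, copied from the training output in this thread.
line = ("281/281 [==============================] - 48s 169ms/step - "
        "loss: 1.0700 - accuracy: 0.6298 - val_loss: 1.5535 - val_accuracy: 0.5050")

metrics = parse_epoch_line(line)
print(metrics)
print(reached_target(metrics))  # False: loss 1.07 > 0.7 and accuracy 0.6298 < 0.90
```

The sketch uses the training metrics because that is what the advice refers to. The `val_loss` and `val_accuracy` values are parsed as well, since metrics on held-out data are usually the better indicator of how the model will do on new images.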

amallais commented 3 years ago

I'll let it train for a while, and let you know. Thank you very much for your time :)

ekesdf commented 3 years ago

I wish you luck and I really think it will work :)

amallais commented 3 years ago

OK, even after the model rebuild from the weekend, I still got the same issue.

But I tried another model algorithm and it seems to work the way it's meant to.

The issue is clearly with ResNet50. I even found a closed issue telling us not to use ResNet50 : https://github.com/OlafenwaMoses/ImageAI/issues/641

city535369abcd commented 3 years ago

I have the same problem with ResNet50, and my model is acc-0.995094. The result always showed as follows: police : 100.0 waiter : 0.0 pilot : 0.0 mechanic : 0.0 judge : 0.0

The proof is in the following image URL.

https://res.cloudinary.com/dj330whg6/image/upload/v1623374997/res_f5v2rk.jpg
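
(Editor's sketch) To tell "one wrong prediction" apart from "the model always returns the same class", it can help to run the classifier over several images and count the top labels. The following is a minimal sketch: `tally_top_labels` and `looks_collapsed` are hypothetical helper names, and `classify` is any callable that returns `(predictions, probabilities)` with the top result first, as `classifyImage` does in the scripts above. A stub stands in for the real model so the example runs without ImageAI installed.

```python
from collections import Counter

def tally_top_labels(classify, image_paths):
    """Count how often each label is the top prediction across the given images."""
    counts = Counter()
    for path in image_paths:
        predictions, probabilities = classify(path)
        counts[predictions[0]] += 1
    return counts

def looks_collapsed(counts, threshold=0.9):
    """True if a single label accounts for at least `threshold` of all top predictions."""
    total = sum(counts.values())
    if total == 0:
        return False
    return counts.most_common(1)[0][1] / total >= threshold

# Stub that mimics the symptom reported in this thread: the same label every time.
def fake_classify(path):
    return ["police", "waiter", "pilot"], [100.0, 0.0, 0.0]

counts = tally_top_labels(fake_classify, ["a.jpg", "b.jpg", "c.jpg"])
print(counts)
print(looks_collapsed(counts))  # True

# With a loaded model, the callable could be (assumption, mirroring the scripts above):
#   lambda p: prediction.classifyImage(p, result_count=5)
```

If images from several different classes all produce the same top label, the problem is more likely in model loading or preprocessing than in the amount of training.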

amallais commented 3 years ago

As I said, take a look here : https://github.com/OlafenwaMoses/ImageAI/issues/641

Resnet50 is broken with tensorflow 2.4
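
(Editor's sketch) A quick way to check whether an environment matches the combination reported here is to look up the installed TensorFlow version before choosing ResNet50. This is a minimal sketch that relies only on what this thread reports (TensorFlow 2.4 with imageai 2.1.6); whether other versions are affected is not stated, and the helper names are hypothetical.

```python
from importlib import metadata

def installed_tensorflow_version(candidates=("tensorflow", "tensorflow-gpu")):
    """Return the installed version string of the first package found, else None."""
    for name in candidates:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None

def is_reported_affected(tf_version):
    """True if the version is in the 2.4 series, the one reported in this thread."""
    return tf_version.split(".")[:2] == ["2", "4"]

version = installed_tensorflow_version()
if version is None:
    print("TensorFlow is not installed in this environment.")
elif is_reported_affected(version):
    print(f"TensorFlow {version}: ResNet50 predictions were reported as broken here.")
else:
    print(f"TensorFlow {version}: not the version reported in this thread.")
```

Comparing the first two version components as strings avoids treating 2.14 or 2.40 as 2.4.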