tobybreckon / fire-detection-cnn

real-time fire detection in video imagery using a convolutional neural network (deep learning) - from our ICIP 2018 paper (Dunnings / Breckon) + ICMLA 2019 paper (Samarth / Bhowmik / Breckon)
MIT License
534 stars 171 forks

Please add some validation images to the project #36

Closed yurivict closed 4 years ago

yurivict commented 4 years ago

Since there are several unknowns - the 0..255 input value range (vs 0..1), potential image resizing issues, and the presence of some training logic in the resulting model - it would be useful to have some images with precisely known expected outcomes.

So if you could add several 224x224 images with corresponding expected outcomes [Fire, NoFire], this would be very useful to people who are trying to run the net and want to validate their way of running it.

Thank you, Yuri

tobybreckon commented 4 years ago

Thanks for your comments.

We in fact provide a 10.5GB dataset of images for download, for which the subset used for training/testing is clearly stated in the corresponding README.txt in the download. The corresponding performance rates are provided in the paper, and this is the largest fire image dataset available globally as of right now - what more do you want?

Run the model over the images - do you get the same accuracy?

yurivict commented 4 years ago

Somehow I can't validate the model with your images.

For example:

$ label_image --tflite_model firenet.tflite --image FireDataTwo4333.bmp --labels labels.txt
Loaded model firenet.tflite
resolved reporter
INFO: Initialized TensorFlow Lite runtime.
invoked 
average time: 86.082 ms 
0.999985: 1 No Fire

There is a fire in the image.

I use this TF-Lite model: https://people.freebsd.org/~yuri/firenet.tflite. It was converted from the TF model: https://people.freebsd.org/~yuri/firenet.pb. label_image is built from the TensorFlow project.

Either the conversion is wrong, or the input range is wrong - it is hard to tell.

tobybreckon commented 4 years ago

Have you checked the FAQ issue post on the one-hot encoding definition we used ... when you compare it to your labels.txt?

This would appear to be inverted in your code.

As, again, we say in the FAQ issue post, our inputs are 0-255, not rescaled.

Run over the entire dataset; if the accuracy is inverted from what we reported, then your labelling is probably inverted.
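The inversion check described above can be sketched in a few lines (an illustrative sketch only; the function and variable names are hypothetical, and predictions/ground truth are assumed to be already collected as class-index arrays):

```python
import numpy as np

def check_label_inversion(pred_classes, true_classes, reported_accuracy, tol=0.05):
    """Compare measured accuracy against the published figure to spot swapped labels."""
    acc = np.mean(np.asarray(pred_classes) == np.asarray(true_classes))
    if abs(acc - reported_accuracy) < tol:
        return "labels ok"
    if abs((1.0 - acc) - reported_accuracy) < tol:
        return "labels inverted"
    return "conversion problem"

# toy example: every prediction is the opposite of the ground truth,
# so measured accuracy is 0 while the reported accuracy is 1.0
print(check_label_inversion([1, 1, 0, 0], [0, 0, 1, 1], reported_accuracy=1.0))  # labels inverted
```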

yurivict commented 4 years ago

This would appear to be inverted in your code.

My labels.txt:

Fire
No Fire

is according to the FAQ:

Q: How were the labels encoded during training ?
A: The one-hot encoding was made alphabetically with 'fire' being the first class (0) and 'nofire' the second (1) such that:

    fire = (1,0)
    nofire = (0,1)

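Given that encoding, interpreting the two-element network output reduces to an argmax over the (fire, nofire) pair - a minimal sketch (names are illustrative):

```python
import numpy as np

CLASSES = ["fire", "nofire"]  # alphabetical order, matching the one-hot encoding above

def decode(output):
    """Map a FireNet-style two-element output (fire, nofire) to a class name."""
    return CLASSES[int(np.argmax(output))]

print(decode([0.999985, 0.000015]))  # fire
print(decode([0.000015, 0.999985]))  # nofire
```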
tobybreckon commented 4 years ago

In which case just run over the set of test examples outlined in the README.txt doc with the dataset.

If you get the same accuracy - all ok.

If you get 1-accuracy or similar, the labels are inverted (check whether TF Lite maps the first line of labels.txt to the first one-hot position, i.e. (1,0), or to the second, i.e. (0,1)).

If accuracy is completely different, ... your model conversion is wrong somehow.

Hope this helps.

yurivict commented 4 years ago

You should publish the models in formats suitable for computation - .tflite in particular.

tobybreckon commented 4 years ago

Thanks for your suggestion.

This is exactly why we publish the frozen graph .pb protocol buffer version (which you claim had all the extra nodes in it - but these were not visible in TensorFlow/TensorBoard). This is specifically an inference (computation) format.

Converting from this standard TF .pb format to .tflite or other optimized edge computation formats is a fairly standard step, and not very difficult.

It is not possible to publish the models in every possible format for every framework. We publish them in the original framework format, plus a converter to one of the most common formats. We recommend the MMdnn converter project for conversions beyond this.

yurivict commented 4 years ago

which you claim had all the extra nodes in it - but that were not visible in TensorFlow/Tensorboard

Please open it in netron (https://github.com/lutzroeder/netron) and observe this for yourself. The resulting model also has random number generators, switches, etc.

tobybreckon commented 4 years ago

... but as I pointed out before, netron's support for this format is only experimental, whilst these nodes are not shown in TensorFlow/TensorBoard, which is in fact reading this format correctly, as will tensorflow, tflite, etc.

netron is not showing the actual model as loaded by TensorFlow or by tools which fully support the protocol buffer format used by TF.

Assuming you managed to translate this model to tflite and it works - then the constants, random number generators, switches, etc. must not actually be present, or the model would not work properly.

yurivict commented 4 years ago

No, from my perspective FireNet does not work - I was not able to validate it, it always returns "No Fire" no matter what image is supplied.

tobybreckon commented 4 years ago

This is probably because the default code for the example you are using at:

https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/examples/label_image

is subtracting the default mean and rescaling by the default standard deviation for all input pixels.

As both of these are set to 127.5 by default (in one of the header files at this URL), this will result in an input pixel range of -1->1, not 0->255 as required by FireNet.

As a result, all the pixels will be very dark in terms of colour (-1->1 or 0->1) compared to what the network was trained on (0->255), and so it will always classify the image as no fire.

In that particular TensorFlow example, if you use the additional command line options to specify an input mean of zero and a standard deviation of 1, I think the code will not perform any rescaling/subtraction on the pixels. In that case you should get the results you expect, as the network is now getting the pixel range it expects.
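The effect of those two settings can be checked numerically - that example applies (pixel - input_mean) / input_std to every pixel. A small sketch (function name is illustrative):

```python
import numpy as np

def label_image_preprocess(pixels, input_mean=127.5, input_std=127.5):
    """Rescaling applied per pixel by the label_image example: (p - mean) / std."""
    return (np.asarray(pixels, dtype=np.float32) - input_mean) / input_std

px = np.array([0.0, 127.5, 255.0])
# the defaults map 0..255 onto -1..1 - the wrong range for FireNet
print(label_image_preprocess(px))
# mean 0 and std 1 leave the pixels untouched, as FireNet expects
print(label_image_preprocess(px, input_mean=0, input_std=1))
```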

I hope this helps.

tobybreckon commented 4 years ago

Can you please confirm if this solved your issue ?

yurivict commented 4 years ago

Can you please confirm if this solved your issue ?

No, from my perspective FireNet doesn't work (it always says "NoFire").

tobybreckon commented 4 years ago

OK, so I have looked at this further ...

In the branch available at: https://github.com/tobybreckon/fire-detection-cnn/tree/tflite I have added a conversion to .tflite for firenet and also a validation program that now shows, for the test video provided, that the output of all three models (tflearn, .pb, .tflite) is the same for all frames in the video.

To use (do a pull, and checkout this branch - not master):

cd fire-detection-cnn
git pull
git checkout tflite
cd converter
python firenet-conversion.py

which should now additionally produce firenet.tflite in the converter directory.

To validate try:

python firenet-validation.py

which runs each model format on the test video and should then produce ...

Load tflearn model from: ../models/FireNet ...OK
Load protocolbuf (pb) model from: firenet.pb ...OK
Load protocolbuf (pb) model from: firenet.tflite ...OK
Load test video from ../models/test.mp4 ...
frame: 0        : TFLearn (original): [[9.999914e-01 8.576833e-06]]     : Tensorflow .pb (via opencv): [[9.999914e-01 8.576866e-06]]    : TFLite (via tensorflow): [[9.999914e-01 8.576899e-06]]: all equal test - PASS
frame: 1        : TFLearn (original): [[9.999924e-01 7.609045e-06]]     : Tensorflow .pb (via opencv): [[9.999924e-01 7.608987e-06]]    : TFLite (via tensorflow): [[9.999924e-01 7.608980e-06]]: all equal test - PASS
frame: 2        : TFLearn (original): [[9.999967e-01 3.373572e-06]]     : Tensorflow .pb (via opencv): [[9.999967e-01 3.373559e-06]]    : TFLite (via tensorflow): [[9.999967e-01 3.373456e-06]]: all equal test - PASS
frame: 3        : TFLearn (original): [[9.999968e-01 3.165212e-06]]     : Tensorflow .pb (via opencv): [[9.999968e-01 3.165221e-06]]    : TFLite (via tensorflow): [[9.999968e-01 3.165176e-06]]: all equal test - PASS
...

The README in this branch has been updated but does not include the checkout step, pending a pull request.

Back to your original issue - I suspect that the problem is that the input image has pixels in the wrong range (0->1 instead of 0->255, due to the issue in an earlier post) and is possibly in RGB rather than BGR channel ordering (the latter being the default for all things OpenCV - see here).
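The channel-ordering fix is a one-line reversal of the last axis (a numpy sketch; note that OpenCV's imread already returns BGR, so the swap is only needed for images loaded as RGB, e.g. via PIL):

```python
import numpy as np

def swap_channels(image):
    """Reverse the channel axis of an HxWx3 image (RGB <-> BGR)."""
    return image[..., ::-1]

# a pure red pixel in RGB ordering ...
img = np.zeros((1, 1, 3), dtype=np.uint8)
img[0, 0] = [255, 0, 0]
# ... reads as (0, 0, 255) once the channels are swapped to BGR
print(swap_channels(img)[0, 0])
```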

The above provides a working .tflite model, operating on the same images and giving the same output to 7 decimal places (for PASS/FAIL). I hope this helps.

tobybreckon commented 4 years ago

Some more experimentation with a GPU variant seems to show the same output to a numerical precision of 5 decimal places (for PASS/FAIL) for the whole video for all three models, and hence the thresholded fire/no-fire decisions always match - I have amended the code to show this (in this branch). Please let me know if you concur.

yurivict commented 4 years ago

possibly in RGB rather than BGR channel ordering

This was the culprit. Once I swapped channels the network works beautifully.


I recommend you publish the firenet NN in common formats - .tflite, .pb, .onnx - so that it would be easy for people to run. Python scripts that generate the files will become obsolete and unrunnable after a few TF releases, and then people won't be able to generate the files themselves any more. File formats are easier; they are just ProtoBufs and FlatBufs.


You should also emphasize that the 0..255 input range and BGR colour encoding are expected. Or just reformat the net to be 0..1/RGB.


The files I ran are: https://people.freebsd.org/~yuri/firenet.tflite https://people.freebsd.org/~yuri/firenet.pb I had to manually get rid of the 'training' boolean variable to be able to run it.

tobybreckon commented 4 years ago

Good to know it is working.

We'll take what you say about the models on board - it is a trade-off between providing the original plus maintaining converters (which will then always provide a net output compatible with the latest TF release), or providing a set of file format variants which then themselves have to be maintained for compatibility with the latest TF release. With the possible exception of ONNX, many of these formats appear not to be stable in themselves.

In the new tflite branch we address the tflite export issue and the training variables - still to be pulled to master.

The input format requirements are now clearly stated in the FAQ - thanks for catching this.

It is unlikely we will retrain these nets at present, we have a new set of architectures making their debut soon.

Many thanks.