seantempesta closed this issue 6 years ago.
So, this seems like a hack (not a real solution), but since it solved my problem I'll post it and close the issue.
Out of the full yolo v2 configuration documented here: https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
If you simply delete the reorg section it seems to work. Not sure what damage this does to the algorithm, but the model I recently trained was really accurate so it can't be that bad.
Hi @seantempesta , I'm facing the exact same problem. When you say "delete the reorg section", I first naively tried to delete the [reorg] section in the yolo.cfg file. But then, of course there is a size mismatch between layers. Can you please point me toward a list of steps to remove the reorg layer ?
Actually, it looks like I did delete more than just the reorg section:
203,211d202
< [route]
< layers=-9
<
< [reorg]
< stride=2
<
< [route]
< layers=-1,-3
<
So yes, the route layers above and below it too. Keep in mind I have no idea what I'm doing and this could potentially be bad, but I got a working model that seemed pretty accurate.
Also, in case you decide the full YOLO v2 model is too slow and want to use the Tiny model, you need to train using the Tiny weights. I found quite a bit of misinformation on this (other guides say to use the darknet19_448.conv.23 weights, but there are too many layers in it, and even though it works in Darknet, once you convert to Keras you'll get 0 detections).
Instead I trained with the tiny-yolo-voc.weights and that worked really well. I just followed this guide:
https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
And then this tutorial is amazing: https://github.com/hollance/YOLO-CoreML-MPSNNGraph
I pretty much had to do everything exactly as he did (including using his modified version of YAD2K) to get the model to convert to Keras and then to CoreML.
Thanks for your quick answer! I don't necessarily need the speed of tiny YOLO for my case, so I can stay with the full YOLO v2 for accuracy. I have few questions about your modifications:
1/ When you say "pretty accurate", did you run the model over the COCO dataset and check the exact value of mAP? Or do you mean it predicts something "reasonable"?
2/ In the cfg file downloaded from https://github.com/pjreddie/darknet/blob/master/cfg/yolo.cfg, in the last route section, layers=-1,-4 (as opposed to -1,-3 in your code). Is it a copy/paste error? Or did you have a different original cfg?
3/ Did you have any reason to remove layers=-9? As you don't need to reorg the layer, it looks like it was not necessary.
If you want to reuse the existing Yolo 2 network and weights without retraining, another option that I just got working, but is a bit involved, is roughly:
That last part is the tricky part and will take some experimenting and re-running your coreml conversion script.
https://apple.github.io/coremltools/generated/coremltools.models.neural_network.html
@ldocao: Regarding your questions:
1/ when you say "pretty accurate", did you run the model over the COCO dataset and check the exact value of mAP ? Or do you mean it predicts something "reasonable"?
I trained it for a new object class and found it to work well for my application. I believe it had a 99% recall rate and an IoU of 88%.
2/ in the cfg file downloaded from https://github.com/pjreddie/darknet/blob/master/cfg/yolo.cfg, in the last route section, layers=-1, -4 (as opposed to -1, -3 in your code). is it a copy/paste error ? Or did you have a different original cfg?
The config I'm using was based on yolo-voc.2.0.cfg. Here's the full diff (diff cfg/custom-full-yolo-v2.cfg cfg/yolo-voc.2.0.cfg):
202a203,211
> [route]
> layers=-9
>
> [reorg]
> stride=2
>
> [route]
> layers=-1,-3
>
215c224
< filters=30
---
> filters=125
219c228
< anchors = 1.82,0.74, 3.09,1.66, 4.04,2.46, 4.87,3.19, 6.32,4.36
---
> anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
221c230
< classes=1
---
> classes=20
3/ did you have any reason to remove layers=-9 ? as you don't need to reorg the layer, it looks like it was not necessary.
Yeah, I probably didn't do this right. Just reporting what worked for me.
@pchensoftware Hi Peter, can you provide more information about the last part, "hack the coremltools", please? Also, if you can upload the converted model it will really help me and others. I'm working on a school project right now where I want to use full YOLO, but for now I'm stuck at tiny YOLO.
@pcmanik Did you eventually manage to solve the issue? Perhaps @pchensoftware could provide some further details?
@eirikhollis No progress here. Waiting for response from @pchensoftware
A quick and dirty solution can be applied directly on the cfg file to remove the reorg layer.
The reorg layer in the keras_yolo.py code uses the space_to_depth function of TensorFlow. It moves extra data from the height and width dimensions into the depth, reducing height and width without losing information.
In the cfg file, the part to modify is the following:
[reorg]
stride=2
[route]
layers=-1,-4
It resizes the convolutional layer just above in the code with the space_to_depth function (from 38x38x64 to 19x19x256) and concatenates it with a previous convolutional layer (19x19x1024) to produce an output of size 19x19x1280.
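To make the shapes concrete, here is a minimal NumPy sketch of what space_to_depth does (my illustration, assuming the NHWC layout TensorFlow uses, not code from this thread):

```python
import numpy as np

def space_to_depth(x, block=2):
    """Move each block x block spatial patch into the channel axis,
    mirroring tf.space_to_depth with block_size=2 (NHWC layout)."""
    n, h, w, c = x.shape
    x = x.reshape(n, h // block, block, w // block, block, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)      # gather the two block dims together
    return x.reshape(n, h // block, w // block, block * block * c)

x = np.random.rand(1, 38, 38, 64)          # the conv output just before [reorg]
y = space_to_depth(x)
print(y.shape)                             # (1, 19, 19, 256)
```

All 38x38x64 values survive; they are only rearranged, which is why the layer is lossless.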
It is possible to replace these 4 lines of code with:
[maxpool]
size=2
stride=2
[route]
layers=-2
[maxpool]
size=2
stride=2
[route]
layers=-4
[maxpool]
size=2
stride=2
[route]
layers=-6
[maxpool]
size=2
stride=2
[route]
layers=-1,-3,-5,-7
[route]
layers=-1, -11
It will transform the output of the convolutional layer (38x38x64) into a smaller one (19x19x64). This output is duplicated 4 times and the copies are concatenated to match the correct shape (19x19x256). The final route then concatenates the stacked feature maps with the previous convolutional layer to produce an output with the correct shape (19x19x1280). EDIT: the Keras to CoreML converter doesn't allow a [route] with the same layer name multiple times, so it is not possible to just use layers=-1,-1,-1,-1 instead of the four independent routes.
If you don't match the correct shapes, the trained weights won't be at the corresponding layers and it will produce wrong results.
Using the specific postprocessing presented here with a classification confidence threshold of 0.3 and an IoU threshold of 0.4, the results are not as good as before, but they look correct on the images.
Using the 2017 COCO validation dataset and the Python COCO API to compute the mAP scores, we obtain the following results before removing the reorg layer:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.195
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.421
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.154
The results after the quick and dirty solution:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.159
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.368
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.110
A few points are lost because we should not use duplicates of the maxpooling layer; it needs to be the entire 38x38x64 layer, reshaped as 19x19x256. I couldn't find a quick solution to this just by modifying the cfg file.
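A toy NumPy example (mine, not from the thread) of why the duplicated max pools lose information compared to the real reorg:

```python
import numpy as np

# One 2x2 spatial patch of a single channel.
patch = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

# space_to_depth keeps all four values, merely moved into channels:
reorg = patch.reshape(4)

# A size-2, stride-2 max pool keeps only the maximum of the patch;
# routing four copies of that pool just repeats the same value:
approx = np.repeat(patch.max(), 4)

print(reorg)    # [1. 2. 3. 4.]  lossless rearrangement
print(approx)   # [4. 4. 4. 4.]  three of the four values are gone
```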
A longer solution is to construct the entire graph in TensorFlow (or another framework) with a reshape tensor at the needed place (between batch_normalization_20 and conv2d_22). Then you load the weights to perfectly match the graph and export it as a protobuf file. Finally you convert the protobuf file to a Keras model, and then into a CoreML model.
Thank you @ArthurOuaknine, I have successfully converted YOLO to Keras with your instructions. But converting from Keras to CoreML ended with an error. See the image please.
Am I doing anything wrong?
Here is the code for converting the model.
coreml_model = coremltools.converters.keras.convert(
    'yad2k/model_data/yolo-voc.h5',
    input_names='image',
    image_input_names='image',
    output_names='grid',
    image_scale=1/255.)
I had the same problem as you @pcmanik. Have you used the following route?
[route]
layers=-1,-1,-1,-1
CoreML doesn't allow the same name for multiple inputs. I think the converter understands it as a single input. I have edited my previous comment to change the replacement code.
@ArthurOuaknine Thank you. You did an amazing job there. After your edit I finally got it working. The resulting CoreML model works and it's a big improvement over TinyYOLO, so your "quick and dirty solution" is really good.
My really big thanks to you! :)
@ArthurOuaknine Much appreciation from me as well. Finally got it to work!
I'm a programmer but a newb at ML, so after seeing how inaccurate TinyYOLO was on a phone I wanted to find a pre-made YOLOv2 .mlmodel >.<
Has anyone put their resulting .mlmodel of their YOLOv2 anywhere that is accessible?
@keless Write me an email.
Just contact me by mail and I will send you the .mlmodel.
In general @pchensoftware described everything correctly, but I will give a more detailed explanation for those who want to convert the network themselves.
I made it all on Linux; maybe some corrections are needed for Mac or Windows. There are a lot of dirty hacks. I assume you want it to just fucking work by making a kludge in the heart of your code, not a proper solution with tests, CI, agile etc. If so, then:
So you have trained a full YOLO and want to run it on CoreML. First, convert it to Keras format using:
./yad2k.py yolo.cfg yolo.weights yolo-voc.h5
The name yolo here is just an example; replace it with the names of your weight and config files.
If you trained your own network, it's possible that it has a slightly different header format for the .weights file. You can notice it by looking at the output of yad2k.py. At the end it writes Read 50768311 of 50768311.0 from Darknet weights. If the numbers mismatch by one, then your model is probably using a different header format for .weights, where the length field is 64 bits instead of the previous 32. If so, and only then, apply the following patch to yad2k.py:
Replace
shape=(4, ), dtype='int32', buffer=weights_file.read(16))
with
shape=(5, ), dtype='int32', buffer=weights_file.read(20))
In my case, as you can see, the numbers are equal and nothing has to be done.
In the end you should have your yolo-voc.h5 converted. Try it with:
./test_yolo.py --anchors_path ... --classes_path ... model_data/yolo-voc.h5
In images/out you should see images with detected objects.
First of all, get the source of coremltools; you will need to patch it. Build it from source using the instructions on their site.
Create a file convert.py in the directory where you cloned the source with this content:
#!/usr/bin/env python
from coremltools._scripts.converter import _main
_main()
This is needed just to run the conversion. Maybe there is some other way, who knows. So finally, you will run all of this with this command:
./convert.py --srcModelPath yolo-voc.h5 --dstModelPath yolo-voc.mlmodel --inputNames image --imageInputNames image --outputNames grid --scale 0.00392156885937
The scale parameter is important; don't forget to set it (it is just 1/255).
But before you run it, let's correct a few things first. First of all, coremltools is available only for Python 2. If you know how to run it with Python 3, don't do what I describe in the Patching Keras chapter; for me it was easier to kludge the code so it just works.
Make a virtualenv
virtualenv -p python2 --system-site-packages venv2
Activate it
source venv2/bin/activate
Install some libs
pip install h5py==2.7.1
pip install keras==2.0.6
Then patch the main code. Here is the diff:
diff --git a/coremltools/converters/keras/_keras2_converter.py b/coremltools/converters/keras/_keras2_converter.py
index 530c8bf..8e3cc95 100644
--- a/coremltools/converters/keras/_keras2_converter.py
+++ b/coremltools/converters/keras/_keras2_converter.py
@@ -69,6 +69,8 @@ if _HAS_KERAS2_TF:
_keras.applications.mobilenet.DepthwiseConv2D:_layers2.convert_convolution,
+ _keras.layers.core.Lambda: _layers2.convert_reorganize,
+
}
diff --git a/coremltools/converters/keras/_layers2.py b/coremltools/converters/keras/_layers2.py
index 01d2bdd..900af43 100644
--- a/coremltools/converters/keras/_layers2.py
+++ b/coremltools/converters/keras/_layers2.py
@@ -866,6 +866,12 @@ def convert_reshape(builder, layer, input_names, output_names, keras_layer):
else:
_utils.raise_error_unsupported_categorical_option('input_shape', str(input_shape), 'reshape', layer)
+def convert_reorganize(builder, layer, input_names, output_names, keras_layer):
+
+ input_name, output_name = (input_names[0], output_names[0])
+
+ builder.add_reorganize_data(name = layer, input_name = input_name, output_name=output_name, block_size=2)
+
def convert_simple_rnn(builder, layer, input_names, output_names, keras_layer):
"""
Convert an SimpleRNN layer from keras to coreml.
Don't do this if you have coremltools running in Python 3; only if you have the out-of-the-box version with Python 2.
You also need to patch the Keras code a bit. Edit venv2/local/lib/python2.7/site-packages/keras/utils/generic_utils.py and add these two functions:
def space_to_depth_x2(x):
"""Thin wrapper for Tensorflow space_to_depth with block_size=2."""
# Import currently required to make Lambda work.
# See: https://github.com/fchollet/keras/issues/5088#issuecomment-273851273
import tensorflow as tf
return tf.space_to_depth(x, block_size=2)
def space_to_depth_x2_output_shape(input_shape):
"""Determine space_to_depth output shape for block_size=2.
Note: For Lambda with TensorFlow backend, output shape may not be needed.
"""
return (input_shape[0], input_shape[1] // 2, input_shape[2] // 2, 4 *
input_shape[3]) if input_shape[1] else (input_shape[0], None, None,
4 * input_shape[3])
and instead of marshal.loads(code.encode('raw_unicode_escape'))
put
if len(code) == 422:
code = space_to_depth_x2.__code__
else:
code = space_to_depth_x2_output_shape.__code__
It's possible that the len is different in your case, maybe not; remember this is a very dirty hack that works only on one PC. In general this code is executed twice: the first time you have to pass code = space_to_depth_x2.__code__, the second time code = space_to_depth_x2_output_shape.__code__. How you do it is up to you.
Then run the conversion:
./convert.py --srcModelPath yolo-voc.h5 --dstModelPath yolo-voc.mlmodel --inputNames image --imageInputNames image --outputNames grid --scale 0.00392156885937
After that I can run my full YOLO on iphone.
@ArthurOuaknine I tried your solution and have a keras .h5 model. However, when I try to convert it to coreml I get the following error:
ValueError: Keras layer '<class 'keras.legacy.layers.Merge'>' not supported.
Do I have to add a custom layer? How do I avoid this problem?
@tkreiman I am sorry, but I never had this error. Obviously CoreML doesn't support the Merge layer, so you have to modify the structure of your model to only have supported layers. I think you should replace the Merge layer with a Concatenate layer. If you are working on YOLO, I suggest you modify the .cfg file directly before converting to Keras (the structure is clear and easy to modify).
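For anyone hitting the same error, a minimal sketch of the supported join (written against tensorflow.keras for convenience; the thread itself used standalone Keras 2.0.6, and the shapes are the YOLOv2 ones discussed above):

```python
import tensorflow as tf
from tensorflow.keras import layers

# The two branches YOLOv2 joins: the reorganized high-resolution
# features and the deeper 19x19 feature map.
a = layers.Input(shape=(19, 19, 256))
b = layers.Input(shape=(19, 19, 1024))

# Concatenating along channels is a layer the CoreML converter supports,
# unlike the legacy keras.legacy.layers.Merge.
merged = layers.Concatenate(axis=-1)([a, b])
model = tf.keras.Model([a, b], merged)
print(model.output_shape)   # (None, 19, 19, 1280)
```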
@ArthurOuaknine OK, I replaced the merge layer with Concatenate and it works now, thanks.
I'm trying to convert a Darknet YOLO v2 model to Keras and then to CoreML using Apple's coremltools: https://github.com/apple/coremltools/
This procedure apparently used to work according to this tutorial: https://github.com/hollance/YOLO-CoreML-MPSNNGraph
I'm kind of a noob, but from what I understand Lambda layers just allow you to run arbitrary code (which is unsurprisingly not supported by Apple's API). It looks like this is where this is happening: yad2k.py
Is there a way to do space_to_depth with the keras API so the conversion is supported? I'm really out of my depth (pun intended) here and don't really understand what's going on. Any help would be appreciated. :)