seantempesta closed this issue 6 years ago.
So, this seems like a hack (not a real solution), but since it solved my problem I'll post it and close the issue.
Out of the full yolo v2 configuration documented here: https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
If you simply delete the reorg section it seems to work. Not sure what damage this does to the algorithm, but the model I recently trained was really accurate so it can't be that bad.
Hi @seantempesta , I'm facing the exact same problem. When you say "delete the reorg section", I first naively tried to delete the [reorg] section in the yolo.cfg file. But then, of course there is a size mismatch between layers. Can you please point me toward a list of steps to remove the reorg layer ?
Actually, it looks like I did delete more than just the reorg section:
203,211d202
< [route]
< layers=-9
<
< [reorg]
< stride=2
<
< [route]
< layers=-1,-3
<
So yes, the route layers above and below it too. Keep in mind I have no idea what I'm doing and this could potentially be bad, but I got a working model that seemed pretty accurate.
Also, in case you decide the full YOLO v2 model is too slow and want to use the Tiny model, you need to train using the Tiny weights. I found quite a bit of misinformation on this (other guides say to use the darknet19_448.conv.23 weights, but there are too many layers in it, and even though it works in Darknet, once you convert to Keras you'll get 0 detections).
Instead I trained with the tiny-yolo-voc.weights and that worked really well. I just followed this guide:
https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
And then this tutorial is amazing: https://github.com/hollance/YOLO-CoreML-MPSNNGraph
I pretty much had to do everything exactly as he did (including using his modified version of YAD2K) to get the model to convert to Keras and then to CoreML.
Thanks for your quick answer! I don't necessarily need the speed of tiny YOLO for my case, so I can stay with the full YOLO v2 for accuracy. I have few questions about your modifications:
1/ When you say "pretty accurate", did you run the model over the COCO dataset and check the exact value of mAP? Or do you mean it predicts something "reasonable"?
2/ In the cfg file downloaded from https://github.com/pjreddie/darknet/blob/master/cfg/yolo.cfg, in the last route section, layers=-1,-4 (as opposed to -1,-3 in your code). Is it a copy/paste error? Or did you have a different original cfg?
3/ Did you have any reason to remove layers=-9? As you don't need to reorg the layer, it looks like it was not necessary.
If you want to reuse the existing Yolo 2 network and weights without retraining, another option that I just got working, but is a bit involved, is roughly:
That last part is the tricky part and will take some experimenting and re-running your coreml conversion script.
https://apple.github.io/coremltools/generated/coremltools.models.neural_network.html
@ldocao: Regarding your questions:
1/ when you say "pretty accurate", did you run the model over the COCO dataset and check the exact value of mAP ? Or do you mean it predicts something "reasonable"?
I trained it for a new object class and found it to work well for my application. I believe it had a 99% recall rate and an IoU of 88%.
2/ in the cfg file downloaded from https://github.com/pjreddie/darknet/blob/master/cfg/yolo.cfg, in the last route section, layers=-1, -4 (as opposed to -1, -3 in your code). is it a copy/paste error ? Or did you have a different original cfg?
The config I'm using was based on yolo-voc.2.0.cfg. Here's the full diff (diff cfg/custom-full-yolo-v2.cfg cfg/yolo-voc.2.0.cfg):
202a203,211
> [route]
> layers=-9
>
> [reorg]
> stride=2
>
> [route]
> layers=-1,-3
>
215c224
< filters=30
---
> filters=125
219c228
< anchors = 1.82,0.74, 3.09,1.66, 4.04,2.46, 4.87,3.19, 6.32,4.36
---
> anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
221c230
< classes=1
---
> classes=20
3/ did you have any reason to remove layers=-9 ? as you don't need to reorg the layer, it looks like it was not necessary.
Yeah, I probably didn't do this right. Just reporting what worked for me.
@pchensoftware Hi Peter, can you provide more information about the last part, "hack the coremltools", please? Also, if you can upload the converted model it will really help me and others. I'm working on a school project right now where I want to use full YOLO, but for now I'm stuck at tiny YOLO.
@pcmanik Did you eventually manage to solve the issue? Perhaps @pchensoftware could provide some further details?
@eirikhollis No progress here. Waiting for response from @pchensoftware
A quick and dirty solution can be applied directly on the cfg file to remove the reorg layer.
The reorg layer in the keras_yolo.py code uses the space_to_depth function of TensorFlow. It moves extra data from the height and width dimensions into the depth, reducing height and width without losing information.
In the cfg file, the part to modify is the following:
[reorg]
stride=2
[route]
layers=-1,-4
It resizes the convolutional layer just above in the code with the space_to_depth function (from 38x38x64 to 19x19x256) and concatenates it with a previous convolutional layer (19x19x1024) to produce an output of size 19x19x1280.
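To make the shapes concrete, here is a minimal NumPy sketch of what space_to_depth does (my illustration, assuming the NHWC layout TensorFlow uses, not code from this thread):

```python
import numpy as np

def space_to_depth(x, block=2):
    """Move each block x block spatial patch into the channel axis,
    mirroring tf.space_to_depth with block_size=2 (NHWC layout)."""
    n, h, w, c = x.shape
    x = x.reshape(n, h // block, block, w // block, block, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)      # gather the two block dims together
    return x.reshape(n, h // block, w // block, block * block * c)

x = np.random.rand(1, 38, 38, 64)          # the conv output just before [reorg]
y = space_to_depth(x)
print(y.shape)                             # (1, 19, 19, 256)
```

All 38x38x64 values survive; they are only rearranged, which is why the layer is lossless.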
It is possible to replace these 4 lines of code with:
[maxpool]
size=2
stride=2
[route]
layers=-2
[maxpool]
size=2
stride=2
[route]
layers=-4
[maxpool]
size=2
stride=2
[route]
layers=-6
[maxpool]
size=2
stride=2
[route]
layers=-1,-3,-5,-7
[route]
layers=-1, -11
It will transform the output of the convolutional layer (38x38x64) into a smaller one (19x19x64). This output is duplicated 4 times and the copies are concatenated to match the correct shape (19x19x256). The final route then concatenates the stacked feature maps with the previous convolutional layer to produce an output with the correct shape (19x19x1280). EDIT: the Keras to CoreML converter doesn't allow a [route] with the same layer name multiple times, so it is not possible to just use layers=-1,-1,-1,-1 instead of the four independent routes.
If you don't match the correct shapes, the trained weights won't be at the corresponding layers and it will produce wrong results.
Using the specific postprocessing presented here with a classification confidence threshold of 0.3 and an IoU threshold of 0.4, the results are not as good as before, but they look correct on the images.
Using the 2017 COCO validation dataset and the Python COCO API to compute the mAP scores, we obtain the following results before removing the reorg layer:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.195
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.421
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.154
The results after the quick and dirty solution:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.159
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.368
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.110
A few points are lost because we should not use duplicates of the maxpooling layer; it needs to be the entire 38x38x64 layer, reshaped as 19x19x256. I couldn't find a quick solution to this just by modifying the cfg file.
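A toy NumPy example (mine, not from the thread) of why the duplicated max pools lose information compared to the real reorg:

```python
import numpy as np

# One 2x2 spatial patch of a single channel.
patch = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

# space_to_depth keeps all four values, merely moved into channels:
reorg = patch.reshape(4)

# A size-2, stride-2 max pool keeps only the maximum of the patch;
# routing four copies of that pool just repeats the same value:
approx = np.repeat(patch.max(), 4)

print(reorg)    # [1. 2. 3. 4.]  lossless rearrangement
print(approx)   # [4. 4. 4. 4.]  three of the four values are gone
```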
A longer solution is to construct the entire graph in TensorFlow (or another framework) with a reshape tensor at the needed place (between batch_normalization_20 and conv2d_22). Then you load the weights to perfectly match the graph and export it as a protobuf file. Finally you convert the protobuf file to a Keras model, and then into a CoreML model.
Thank you @ArthurOuaknine, I have successfully converted YOLO to Keras with your instructions. But converting from Keras to CoreML ended with an error. See the image please.
Am I doing anything wrong?
Here is the code for converting the model.
coreml_model = coremltools.converters.keras.convert(
    'yad2k/model_data/yolo-voc.h5',
    input_names='image',
    image_input_names='image',
    output_names='grid',
    image_scale=1/255.)
I had the same problem as you @pcmanik. Have you used the following route?
[route]
layers=-1,-1,-1,-1
CoreML doesn't allow the same name for multiple inputs. I think the converter understands it as a single input. I have edited my previous comment to change the replacement code.
@ArthurOuaknine Thank you. You did an amazing job there. After your edit I finally got it working. The resulting CoreML model works and it's a big improvement over TinyYOLO, so your "quick and dirty solution" is really good.
My really big thanks to you! :)
@ArthurOuaknine Much appreciation from me as well. Finally got it to work!
I'm a programmer but a newb at ML, so after seeing how inaccurate TinyYOLO was on a phone I wanted to find a pre-made YOLOv2 .mlmodel >.<
Has anyone put their resulting .mlmodel of their YOLOv2 anywhere that is accessible?
@keless Write me an email.
Just contact me by mail and I will send you the .mlmodel.
In general @pchensoftware described everything correctly, but I will give a more detailed explanation for those who want to convert the network themselves.
I made it all on Linux; maybe some corrections are needed for Mac or Windows. There are a lot of dirty hacks. I assume you want it to just fucking work by making a kludge in the heart of your code, not a proper solution with tests, CI, agile etc. If so, then:
So you have trained a full YOLO and want to run it on CoreML. First, convert it to Keras format using:
./yad2k.py yolo.cfg yolo.weights yolo-voc.h5
The name yolo here is just an example; replace it with the names of your weight and config files.
If you trained your own network, it's possible that it has a slightly different header format for the .weights file. You can notice it by looking at the output of yad2k.py. At the end it writes Read 50768311 of 50768311.0 from Darknet weights. If the numbers mismatch by one, then your model is probably using a different header format for .weights, where the length field is 64 bits instead of the previous 32. If so, and only then, apply the following patch to yad2k.py:
Replace
shape=(4, ), dtype='int32', buffer=weights_file.read(16))
with
shape=(5, ), dtype='int32', buffer=weights_file.read(20))
In my case, as you can see, the numbers are equal and nothing has to be done.
In the end you should have your yolo-voc.h5 converted. Try it with:
./test_yolo.py --anchors_path ... --classes_path ... model_data/yolo-voc.h5
In images/out you should see images with detected objects.
First of all, get the source of coremltools; you will need to patch it. Build it from source using the instructions on their site.
Create a file convert.py in the directory where you cloned the source with this content:
#!/usr/bin/env python
from coremltools._scripts.converter import _main
_main()
This is needed just to run the conversion. Maybe there is some other way, who knows. So finally, you will run all of this with this command:
./convert.py --srcModelPath yolo-voc.h5 --dstModelPath yolo-voc.mlmodel --inputNames image --imageInputNames image --outputNames grid --scale 0.00392156885937
The scale parameter is important; don't forget to set it (it is just 1/255).
But before you run it, let's correct a few things first. First of all, coremltools is available only for Python 2. If you know how to run it with Python 3, don't do what I describe in the Patching Keras chapter; for me it was easier to kludge the code so it just works.
Make a virtualenv
virtualenv -p python2 --system-site-packages venv2
Activate it
source venv2/bin/activate
Install some libs
pip install h5py==2.7.1
pip install keras==2.0.6
Then patch the main code. Here is the diff:
diff --git a/coremltools/converters/keras/_keras2_converter.py b/coremltools/converters/keras/_keras2_converter.py
index 530c8bf..8e3cc95 100644
--- a/coremltools/converters/keras/_keras2_converter.py
+++ b/coremltools/converters/keras/_keras2_converter.py
@@ -69,6 +69,8 @@ if _HAS_KERAS2_TF:
_keras.applications.mobilenet.DepthwiseConv2D:_layers2.convert_convolution,
+ _keras.layers.core.Lambda: _layers2.convert_reorganize,
+
}
diff --git a/coremltools/converters/keras/_layers2.py b/coremltools/converters/keras/_layers2.py
index 01d2bdd..900af43 100644
--- a/coremltools/converters/keras/_layers2.py
+++ b/coremltools/converters/keras/_layers2.py
@@ -866,6 +866,12 @@ def convert_reshape(builder, layer, input_names, output_names, keras_layer):
else:
_utils.raise_error_unsupported_categorical_option('input_shape', str(input_shape), 'reshape', layer)
+def convert_reorganize(builder, layer, input_names, output_names, keras_layer):
+
+ input_name, output_name = (input_names[0], output_names[0])
+
+ builder.add_reorganize_data(name = layer, input_name = input_name, output_name=output_name, block_size=2)
+
def convert_simple_rnn(builder, layer, input_names, output_names, keras_layer):
"""
Convert an SimpleRNN layer from keras to coreml.
Don't do this if you have coremltools running in Python 3; only if you have the out-of-the-box version with Python 2.
You also need to patch the Keras code a bit. Edit venv2/local/lib/python2.7/site-packages/keras/utils/generic_utils.py and add these two functions:
def space_to_depth_x2(x):
"""Thin wrapper for Tensorflow space_to_depth with block_size=2."""
# Import currently required to make Lambda work.
# See: https://github.com/fchollet/keras/issues/5088#issuecomment-273851273
import tensorflow as tf
return tf.space_to_depth(x, block_size=2)
def space_to_depth_x2_output_shape(input_shape):
"""Determine space_to_depth output shape for block_size=2.
Note: For Lambda with TensorFlow backend, output shape may not be needed.
"""
return (input_shape[0], input_shape[1] // 2, input_shape[2] // 2, 4 *
input_shape[3]) if input_shape[1] else (input_shape[0], None, None,
4 * input_shape[3])
and instead of marshal.loads(code.encode('raw_unicode_escape'))
put
if len(code) == 422:
code = space_to_depth_x2.__code__
else:
code = space_to_depth_x2_output_shape.__code__
It's possible that the len is different in your case, maybe not; remember this is a very dirty hack that works only on one PC. In general this code is executed twice: the first time you have to pass code = space_to_depth_x2.__code__, the second time code = space_to_depth_x2_output_shape.__code__. How you do it is up to you.
Then run the conversion:
./convert.py --srcModelPath yolo-voc.h5 --dstModelPath yolo-voc.mlmodel --inputNames image --imageInputNames image --outputNames grid --scale 0.00392156885937
After that I can run my full YOLO on iphone.
@ArthurOuaknine I tried your solution and have a keras .h5 model. However, when I try to convert it to coreml I get the following error:
ValueError: Keras layer '<class 'keras.legacy.layers.Merge'>' not supported.
Do I have to add a custom layer? How do I avoid this problem?
@tkreiman I am sorry, but I never had this error. Obviously CoreML doesn't support the Merge layer, so you have to modify the structure of your model to only have supported layers. I think you should replace the Merge layer with a Concatenate layer. If you are working on YOLO, I suggest you modify the .cfg file directly before converting to Keras (the structure is clear and easy to modify).
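For anyone hitting the same error, a minimal sketch of the supported join (written against tensorflow.keras for convenience; the thread itself used standalone Keras 2.0.6, and the shapes are the YOLOv2 ones discussed above):

```python
import tensorflow as tf
from tensorflow.keras import layers

# The two branches YOLOv2 joins: the reorganized high-resolution
# features and the deeper 19x19 feature map.
a = layers.Input(shape=(19, 19, 256))
b = layers.Input(shape=(19, 19, 1024))

# Concatenating along channels is a layer the CoreML converter supports,
# unlike the legacy keras.legacy.layers.Merge.
merged = layers.Concatenate(axis=-1)([a, b])
model = tf.keras.Model([a, b], merged)
print(model.output_shape)   # (None, 19, 19, 1280)
```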
@ArthurOuaknine OK, I replaced the merge layer with Concatenate and it works now, thanks.
I'm trying to convert a Darknet YOLO v2 model to Keras and then to CoreML using Apple's coremltools: https://github.com/apple/coremltools/
This procedure apparently used to work according to this tutorial: https://github.com/hollance/YOLO-CoreML-MPSNNGraph
I'm kind of a noob, but from what I understand Lambda layers just allow you to run arbitrary code (which is unsurprisingly not supported by Apple's API). It looks like this is where this is happening: yad2k.py
Is there a way to do space_to_depth with the keras API so the conversion is supported? I'm really out of my depth (pun intended) here and don't really understand what's going on. Any help would be appreciated. :)