sonia-auv-private opened this issue 6 years ago
Yes, of course. Just use the scripts train.py and eval.py provided by Tensorflow's Object Detection API, like you would with any other model.
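For reference, in the TF1-era Object Detection API those two scripts are invoked roughly like this. All paths below are placeholders, and in later releases of tensorflow/models the scripts moved to object_detection/legacy/:

```shell
# Run from the tensorflow/models/research directory.
# Training (writes checkpoints into --train_dir):
python object_detection/train.py \
    --logtostderr \
    --pipeline_config_path=path/to/ssd_mobilenet_v1_coco.config \
    --train_dir=path/to/train_dir

# Evaluation (reads the checkpoints written above):
python object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path=path/to/ssd_mobilenet_v1_coco.config \
    --checkpoint_dir=path/to/train_dir \
    --eval_dir=path/to/eval_dir
```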
In stuff/ssd_mobilenet_checkpoints
you will find the same checkpoint files I used; they are the original ones provided by Tensorflow.
Thank you
Hi,
Just to clarify: I must train with Tensorflow's Object Detection API only on 600x600px or 300x300px images in order for it to work with the config file, then place my trained ckpt file under stuff/ssd_mobilenet_checkpoints
and run your scripts as usual. Is this correct?
Thanks so much.
Hey @uzbhutta,
I suggest you first take a closer look at Tensorflow's original Object Detection API. Try to understand how training and inference work and which scripts to use. After that, take a look at my code and what it does.
To give you a short overview: it does not matter what size the images you train on have, because with TF's API they are always resized to a fixed size that you set in the config file. This size is normally 300x300 for SSD. You can of course train a network on 600x600 if you like, but then you won't be able to use a pretrained model as a starting point, as the weights are bound to the input dimensions you train on.
So during training you get several checkpoints, at an interval that you also set in the config.
And finally, when you want to use my API to do inference, you need to export one of those checkpoint files to a frozen model in the .pb format.
This frozen model can then be included in my API and addressed correctly in my config.yml.
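The export step above uses the Object Detection API's export script; a sketch with placeholder paths and a hypothetical checkpoint number:

```shell
# Run from the tensorflow/models/research directory.
# model.ckpt-200000 is a placeholder for whichever checkpoint you pick.
python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path path/to/ssd_mobilenet_v1_coco.config \
    --trained_checkpoint_prefix path/to/train_dir/model.ckpt-200000 \
    --output_directory path/to/export_dir
# export_dir/frozen_inference_graph.pb is the file to reference in config.yml
```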
And another thing: make sure to use my checkpoint files as the starting point, because my speed hack (the split model + multithreading) only works if your model has the exact same layer names as mine.
I hope I could clarify some things for you.
Cheers Gustav
@GustavZ where is your checkpoint file? I trained on my own labeled data with tensorflow's object detection api using your config file located in models/ssd_mobilenet_v11_coco/. After training, I replaced the frozen graph in models/ssd_mobilenet_v11_coco/.
When running inference, I get this error:
ValueError: Node 'Preprocessor/map/TensorArray_2': Unknown input node 'Preprocessor/map/strided_slice'
I wonder why my frozen graph has the node 'Preprocessor/map/TensorArray_2' but your frozen graph does not.
@David-Lee-1990 Which version of the model zoo did you take (which date is appended at the end)? Tensorflow seems to have changed some layer names in versions newer than the one I used (2017_11_17).
My checkpoint file is inside the model dir of ssd_mobilenet: https://github.com/GustavZ/realtime_object_detection/tree/master/models/ssd_mobilenet_v11_coco
With this checkpoint it should work; at least it did for my retrainings.
I hope I could help you!
@GustavZ I retrained on my data using the config file and the model.ckpt files in your ssd_mobilenet model dir. But after that, I still encounter the same problem (Node 'Preprocessor/map/TensorArray_2'). I wonder whether this is caused by a version difference in tensorflow? My tensorflow version is 1.8.
Yes, pretty sure. There are so many changes between versions that lead to strange behavior and errors. I also keep switching versions whenever I face errors.
Try tf 1.4; that's where I started this project.
tf 1.4 is no longer usable for training with tensorflow's object detection api because of the error: AttributeError: module 'tensorflow.contrib.data' has no attribute 'parallel_interleave'.
I tried tf 1.5 to retrain the model, but the resulting graph still has the node 'Preprocessor/map/TensorArray_2'. This is driving me crazy!
Hi @GustavZ,
First of all thanks for your work. It's really great.
However I have the same problem :/
Traceback (most recent call last):
File "...\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\importer.py", line 489, in import_graph_def
graph._c_graph, serialized, options) # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Node 'Preprocessor/map/TensorArray_2': Unknown input node 'Preprocessor/map/strided_slice'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "...\realtime_object_detection-2.0\run_objectdetection.py", line 178, in <module>
config.NUM_CLASSES,config.SPLIT_MODEL, config.SSD_SHAPE).prepare_od_model()
File "...\realtime_object_detection-2.0\rod\model.py", line 157, in prepare_od_model
self.load_frozenmodel()
File "...\realtime_object_detection-2.0\rod\model.py", line 129, in load_frozenmodel
tf.import_graph_def(remove, name='')
File "...\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\util\deprecation.py", line 432, in new_func
return func(*args, **kwargs)
File "...\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\importer.py", line 493, in import_graph_def
raise ValueError(str(e))
ValueError: Node 'Preprocessor/map/TensorArray_2': Unknown input node 'Preprocessor/map/strided_slice'
I trained my model with tf 1.8, replaced your configuration and model with mine, and tried a run. The same issue occurs when I try a run with your release 1.0.
For information:
Ok, my bad, I turned off SPLIT_MODEL and it works now.
Don't use v2.0. Use master. I will update that next week.
@AnthonyLabaere Hi, after turning off SPLIT_MODEL, your model works now? Is the ValueError: Node 'Preprocessor/map/TensorArray_2' gone?
Again: the split_model speed hack will ONLY work with ssd_mobilenet_v1 models that are exported from the exact same checkpoint that I used and published in /models. Tensorflow, and also the SSDMetaArch inside models/object_detection, keeps changing.
I have no insight into this as I am not working with SSD anymore. If you want to apply the speed hack to other models, you need to investigate on your own. Sorry.
But if you find a solution you are very welcome to contribute / file a PR.
Gustav
@GustavZ ok, thanks!
@David-Lee-1990 I just succeeded in making it work with my model on my computer (on Windows) and on my Raspberry (with some updates). And yes, the issue with 'Preprocessor/map/TensorArray_2' is gone, because that part (with SPLIT_MODEL true) concerns the GPU.
@GustavZ If I find a "real" solution I will make a PR, but for now I haven't found anything :/ Ok, I will use master in the future.
@AnthonyLabaere Is your model trained with tensorflow's object detection api? What do you mean by saying "'Preprocessor/map/TensorArray_2' is gone because this part concerns the GPU"?
I checked the frozen graph generated by tensorflow and found that after the node 'TensorArray_2' the graph goes directly to the batch-NMS nodes, without feature extraction.
@David-Lee-1990 Yes, it is trained with tensorflow's object detection api. Well, concerning 'Preprocessor/map/TensorArray_2', I spoke too fast. I don't know why the problem is gone, sorry.
How do you see that? With Tensorboard?
Hi,
The split model hack is only available for ssd_mobilenet_v1 with 300x300 input. 'Preprocessor/map/TensorArray_2' appears when training on 600x600 images.
Set your ssd_mobilenet_v1_coco.config to 300x300:
image_resizer {
  fixed_shape_resizer {
    height: 300
    width: 300
  }
}
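A quick way to double-check a pipeline config before a long retraining run is to read the resizer dimensions out of it programmatically. The helper below is a hypothetical sketch using only the standard library (the function name and regex are my own, not part of any API):

```python
import re

def resizer_shape(config_text):
    """Return (height, width) from a fixed_shape_resizer block, or None."""
    m = re.search(
        r"fixed_shape_resizer\s*\{\s*height:\s*(\d+)\s*width:\s*(\d+)",
        config_text)
    return (int(m.group(1)), int(m.group(2))) if m else None

config = """
image_resizer {
  fixed_shape_resizer {
    height: 300
    width: 300
  }
}
"""

print(resizer_shape(config))  # (300, 300)
```

If this prints anything other than (300, 300), the exported graph will not match the split-model assumptions described above.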
@naisy Hi, have you tried this 300x300 config? In fact, my config has been set to 300x300 all along, but the error still occurs.
Hi @David-Lee-1990,
I checked the config just now. The config in the master branch was changed. Please use the r1.5 branch for ssd_mobilenet_v1.
--- r1.5 2018-06-18 01:43:31.752331891 +0000
+++ master 2018-06-18 01:43:18.056376250 +0000
@@ -108,12 +108,10 @@
loss {
classification_loss {
weighted_sigmoid {
- anchorwise_output: true
}
}
localization_loss {
weighted_smooth_l1 {
- anchorwise_output: true
}
}
hard_example_miner {
@@ -193,5 +191,4 @@
label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
shuffle: false
num_readers: 1
- num_epochs: 1
}
My own training is here: https://github.com/naisy/train_ssd_mobilenet
@naisy Thank you for your tips. Problem solved!
Hi,
I was wondering if there is any method that would let us retrain this model using Pascal VOC annotation files and images?