
Zero-Shot-Detection

Introduction

A Keras implementation of a zero-shot detection model based on YOLOv3 (TensorFlow backend), adapted from keras-yolo3.


Results on PASCAL VOC


Train and Evaluate

  1. Generate your own annotation file and class names file.
    One row per image.
    Row format: image_file_path box1 box2 ... boxN;
    Box format: x_min,y_min,x_max,y_max,class_id (no spaces).
    For the VOC dataset, run python voc_annotation.py.
    Here is an example (a minimal parsing sketch is shown after this list):

    path/to/img1.jpg 50,100,150,200,0 30,50,200,120,3
    path/to/img2.jpg 120,300,250,600,2
    ...
  2. Download the YOLOv3 weights from the YOLO website. The file model_data/yolo_weights.h5 is used to load the pretrained weights.

  3. Modify train.py and start training with python train.py. Use your trained weights or checkpoint weights in yolo.py, and remember to update the class names path and anchors path. A sketch of the usual weight-loading and layer-freezing setup is shown after this list.

  4. Test the ZSD model and evaluate the results with mAP, or run the visualization demo.
    python test.py OR python demo.py.
    The test file lists one image path per row, in the form path/to/img. A minimal IoU helper used in mAP-style matching is sketched after this list.
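
The annotation format above can be parsed with a few lines of Python. This is a minimal sketch for reference only; the function name is illustrative and not part of the repo.

    import numpy as np

    def parse_annotation_line(line):
        """Split one annotation row into an image path and an (N, 5) box array.

        Each box holds x_min, y_min, x_max, y_max, class_id, as described above.
        """
        parts = line.strip().split()
        image_path = parts[0]
        boxes = np.array([list(map(int, box.split(','))) for box in parts[1:]])
        return image_path, boxes

    # Example:
    # parse_annotation_line("path/to/img1.jpg 50,100,150,200,0 30,50,200,120,3")
    # -> ("path/to/img1.jpg", array([[ 50, 100, 150, 200,   0],
    #                                [ 30,  50, 200, 120,   3]]))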
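
For the first training stage, keras-yolo3-style code typically loads the pretrained weights by name and freezes most of the network. The sketch below only illustrates that pattern; it is not the repo's actual train.py, and the helper name and freeze_until value are made up.

    def setup_first_stage(model, weights_path='model_data/yolo_weights.h5', freeze_until=-3):
        """Load pretrained weights by name and freeze the body for stage-one training.

        `model` is the Keras detection model built elsewhere; this only shows
        the usual load-and-freeze pattern.
        """
        # skip_mismatch skips layers whose shapes do not match (e.g. a new class
        # count) and only issues a warning, which is why the mismatch warning
        # mentioned in the notes below is harmless.
        model.load_weights(weights_path, by_name=True, skip_mismatch=True)

        # Freeze everything except the last few layers for the first stage.
        for layer in model.layers[:freeze_until]:
            layer.trainable = False
        return model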
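
mAP evaluation matches detections to ground-truth boxes by IoU (PASCAL VOC typically uses a 0.5 threshold). A minimal IoU helper is sketched below for reference; it is not the repo's evaluation code.

    def box_iou(box_a, box_b):
        """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
        ix_min = max(box_a[0], box_b[0])
        iy_min = max(box_a[1], box_b[1])
        ix_max = min(box_a[2], box_b[2])
        iy_max = min(box_a[3], box_b[3])
        inter = max(ix_max - ix_min, 0) * max(iy_max - iy_min, 0)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - inter
        return inter / union if union > 0 else 0.0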


Issues to know

  1. The training and testing environment is:

    • Python 3.5.4
    • Keras 2.2.0
    • TensorFlow 1.6.0
  2. The default YOLO anchors are used. If you use your own anchors, some changes will probably be needed. A sketch of reading a keras-yolo3-style anchors file is shown after this list.

  3. The inference results are not exactly the same as Darknet's, but the difference is small.

  4. The speed is slower than Darknet's. Replacing PIL with OpenCV for image loading may help a little (see the sketch after this list).

  5. Always load pretrained weights and freeze layers in the first stage of training, or try Darknet training. It is OK if a weight-mismatch warning appears while loading.

  6. The training strategy is for reference only. Adjust it according to your dataset and your goal, and add further strategies if needed.

  7. For data and results analysis, we recommend running the provided scripts in the analysis directory.

  8. Better semantic descriptions, such as attributes, can improve the results. A sketch of how class embeddings are typically matched is shown after this list.
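
In keras-yolo3-style repos, anchors are usually stored as comma-separated width,height pairs on one line of a text file (e.g. model_data/yolo_anchors.txt). A minimal reading sketch, assuming that format:

    import numpy as np

    def get_anchors(anchors_path='model_data/yolo_anchors.txt'):
        """Read comma-separated w,h values and return an (N, 2) anchor array."""
        with open(anchors_path) as f:
            values = [float(x) for x in f.readline().split(',')]
        return np.array(values).reshape(-1, 2)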
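
If PIL image loading is the bottleneck, OpenCV is usually faster. A minimal replacement sketch; note that OpenCV returns BGR, so it has to be converted before feeding the model:

    import cv2

    def load_image_rgb(path):
        """Read an image with OpenCV and convert BGR to RGB for the model."""
        image = cv2.imread(path)  # uint8 array in BGR order
        return cv2.cvtColor(image, cv2.COLOR_BGR2RGB)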
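
Zero-shot detection typically scores classes by comparing a predicted semantic vector against per-class embeddings (word vectors or attribute vectors for unseen classes). The sketch below shows plain cosine-similarity matching; the names and shapes are illustrative and not taken from the repo's code.

    import numpy as np

    def classify_by_embedding(pred_embedding, class_embeddings):
        """Score classes by cosine similarity between a predicted semantic vector
        of shape (dim,) and a (num_classes, dim) matrix of class embeddings."""
        pred = pred_embedding / (np.linalg.norm(pred_embedding) + 1e-8)
        cls = class_embeddings / (np.linalg.norm(class_embeddings, axis=1, keepdims=True) + 1e-8)
        scores = cls @ pred
        return int(np.argmax(scores)), scores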