lp6m commented 2 years ago

Thankfully, many people have starred this repository, and in response to their requests we have written a tutorial on integrating a custom model.

lp6m commented 2 years ago

Train on custom dataset

Here, we train a yolov5s model on the BDD100K dataset.
Create an account on the BDD100K website, then download the 100K images and labels.

Convert BDD100K dataset to YOLO format.

import json
import os
import glob
import cv2

categories = {'car': 0,
              'bus': 1,
              'person': 2,
              'bike': 3,
              'truck': 4,
              'motor': 5,
              'train': 6,
              'rider': 7,
              'traffic sign': 8,
              'traffic light': 9}

def generate_yolo_labels(json_path, img_base_path, save_path):
    count = 0
    ignore_categories = ["drivable area", "lane"]
    with open(json_path) as json_file:
        data = json.load(json_file)
        if not os.path.exists(save_path):
            os.makedirs(save_path)
        print(len(data))
        for image_data in data:
            img_name = image_data['name']
            img_path = os.path.join(img_base_path, img_name)
            if not os.path.exists(img_path):
                continue
            file_path = os.path.join(save_path, img_name.replace(".jpg", ".txt"))
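            img = cv2.imread(img_path)
            if img is None:  # skip images that cannot be decoded
                continue
            img_h, img_w, _ = img.shape
            # keep only box annotations; lane / drivable-area labels carry no box2d
            img_labels = [l for l in image_data['labels']
                          if l['category'] not in ignore_categories
                          and 'box2d' in l]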
            with open(file_path, 'w') as f_label:
                for label in img_labels:
                    y1 = label['box2d']['y1']
                    x2 = label['box2d']['x2']
                    x1 = label['box2d']['x1']
                    y2 = label['box2d']['y2']
                    class_name = label['category']
                    class_id = categories[class_name]

                    bbox_x = (x1 + x2)/2
                    bbox_y = (y1 + y2)/2

                    bbox_width = x2-x1
                    bbox_height = y2-y1

                    bbox_x_norm = bbox_x / img_w
                    bbox_y_norm = bbox_y / img_h

                    bbox_width_norm = bbox_width / img_w
                    bbox_height_norm = bbox_height / img_h

                    line_to_write = '{} {} {} {} {}'.format(
                        class_id, bbox_x_norm, bbox_y_norm, bbox_width_norm, bbox_height_norm)
                    f_label.write(line_to_write + "\n")
            count += 1
    print(f"generate complete for {count} images.")

generate_yolo_labels(
    "/workspace/dataset/bdd100k/labels/bdd100k_labels_images_train.json",
    "/workspace/dataset/bdd100k/images/100k/train",
    "/workspace/dataset/bdd100k/labels/100k/train"
)

generate_yolo_labels(
    "/workspace/dataset/bdd100k/labels/bdd100k_labels_images_val.json",
    "/workspace/dataset/bdd100k/images/100k/val",
    "/workspace/dataset/bdd100k/labels/100k/val"
)
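
The coordinate conversion inside the script can be checked in isolation. Below is a minimal sketch of the same arithmetic as a standalone function; the function name and the example box are hypothetical and not part of the repository.

```python
def box2d_to_yolo(box2d, img_w, img_h):
    """Convert pixel corners (x1, y1, x2, y2) to normalized YOLO (cx, cy, w, h)."""
    x1, y1, x2, y2 = (box2d[k] for k in ("x1", "y1", "x2", "y2"))
    cx = (x1 + x2) / 2 / img_w   # box center x, as a fraction of image width
    cy = (y1 + y2) / 2 / img_h   # box center y, as a fraction of image height
    w = (x2 - x1) / img_w        # box width, as a fraction of image width
    h = (y2 - y1) / img_h        # box height, as a fraction of image height
    return cx, cy, w, h


# Hypothetical box covering the central half of a 1280x720 frame.
print(box2d_to_yolo({"x1": 320, "y1": 180, "x2": 960, "y2": 540}, 1280, 720))
# -> (0.5, 0.5, 0.5, 0.5)
```

All four values should lie in [0, 1]; a value outside that range usually means the wrong image size was used for normalization.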

docker run -it --shm-size=8g --gpus all -v `pwd`:/workspace yolov5s_android:latest bash
export PYTHONIOENCODING=utf8
python3 train.py --img 640 --batch 4 --epochs 25 --data bdd100k.yaml --weights yolov5s.pt

data/bdd100k.yaml

path: /workspace/dataset/bdd100k/images/100k  # dataset root dir
train: train
val: val
test:  # test images (optional)
# Classes
nc: 10  # number of classes
names: ['car', 'bus', 'person', 'bike', 'truck', 'motor', 'train', 'rider', 'traffic sign', 'traffic light']  # class names
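
The yaml only points at image directories. As far as we understand the YOLOv5 data loader, it finds each label file by swapping the last /images/ path component for /labels/ and the image extension for .txt, which is why the conversion script writes to labels/100k/train and labels/100k/val. A minimal sketch of that mapping, with a hypothetical function name and POSIX paths assumed:

```python
def img_to_label_path(img_path):
    """Map an image path to the label path expected next to it.

    Assumption: mirrors the /images/ -> /labels/ convention of YOLOv5;
    this is an illustration, not the loader's actual code.
    """
    # Replace only the last "/images/" component, then swap the extension.
    swapped = "/labels/".join(img_path.rsplit("/images/", 1))
    return swapped.rsplit(".", 1)[0] + ".txt"


print(img_to_label_path(
    "/workspace/dataset/bdd100k/images/100k/val/b2d8704e-66d10551.jpg"))
# -> /workspace/dataset/bdd100k/labels/100k/val/b2d8704e-66d10551.txt
```

If training reports that no labels were found, comparing the output of this function with the actual label file locations is a quick first check.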


We stopped training at epoch 19 because the mAP was already sufficient for testing.

lp6m commented 2 years ago

Model conversion and quantization

We modified quantize.py to use local images as the calibration dataset instead of tfds. https://github.com/lp6m/yolov5s_android/commit/288985430df60a9ac1396ad461cb729c83936f87

input_size = 640

cd /workspace/yolov5/
cp runs/train/exp8/weights/best.pt ./yolov5s.pt
python3 export.py --weights ./yolov5s.pt --img-size 640 640 --simplify
python3 /opt/intel/openvino_2021.3.394/deployment_tools/model_optimizer/mo.py  --input_model yolov5s.onnx  --input_shape [1,3,640,640]  --output_dir ./openvino  --data_type FP32  --output Conv_253,Conv_302,Conv_351
source /opt/intel/openvino_2021/bin/setupvars.sh 
export PYTHONPATH=/opt/intel/openvino_2021/python/python3.6/:$PYTHONPATH
openvino2tensorflow \
--model_path ./openvino/yolov5s.xml \
--model_output_path tflite \
--output_pb \
--output_saved_model \
--output_no_quant_float32_tflite 
cp /workspace/yolov5/tflite/model_float32.tflite ../tflite_model/yolov5s_fp32_640.tflite 
cd /workspace/convert_model
python3 quantize.py --image_dir /workspace/dataset/bdd100k/images/100k/val/ --input_size 640
cp /workspace/yolov5/tflite/model_quantized.tflite ../tflite_model/yolov5s_int8_640.tflite

input_size = 320

cd /workspace/yolov5/
python3 export.py --weights ./yolov5s.pt --img-size 320 320 --simplify
python3 /opt/intel/openvino_2021.3.394/deployment_tools/model_optimizer/mo.py  --input_model yolov5s.onnx  --input_shape [1,3,320,320]  --output_dir ./openvino  --data_type FP32  --output Conv_253,Conv_302,Conv_351
source /opt/intel/openvino_2021/bin/setupvars.sh 
export PYTHONPATH=/opt/intel/openvino_2021/python/python3.6/:$PYTHONPATH
openvino2tensorflow \
--model_path ./openvino/yolov5s.xml \
--model_output_path tflite \
--output_pb \
--output_saved_model \
--output_no_quant_float32_tflite 
cp /workspace/yolov5/tflite/model_float32.tflite ../tflite_model/yolov5s_fp32_320.tflite 
cd /workspace/convert_model
python3 quantize.py --image_dir /workspace/dataset/bdd100k/images/100k/val/ --input_size 320
cp /workspace/yolov5/tflite/model_quantized.tflite ../tflite_model/yolov5s_int8_320.tflite

Converted models are located at https://github.com/lp6m/yolov5s_android/tree/custom_train/tflite_model

lp6m commented 2 years ago

Model Test

We modified some scripts in ./host to support a custom class_num and class_txt. https://github.com/lp6m/yolov5s_android/commit/4ca3e2757cdcd051267d1eda58da51b42a1437b8 We have not modified evaluate.py, which is used for mAP measurement.

fp32 640x640

cd /workspace/host
python3 detect.py \
--image ../dataset/bdd100k/images/100k/val/b2d8704e-66d10551.jpg  \
--class_num 10 \
--class_txt myclass.txt \
--input_size 640 \
--model_path ../tflite_model/yolov5s_fp32_640.tflite

result

int8 640x640

python3 detect.py \
--image ../dataset/bdd100k/images/100k/val/b2d8704e-66d10551.jpg  \
--class_num 10 \
--class_txt myclass.txt \
--input_size 640 \
--quantize_mode \
--model_path ../tflite_model/yolov5s_int8_640.tflite

result

fp32 320x320

python3 detect.py \
--image ../dataset/bdd100k/images/100k/val/b2d8704e-66d10551.jpg  \
--class_num 10 \
--class_txt myclass.txt \
--input_size 320 \
--model_path ../tflite_model/yolov5s_fp32_320.tflite

result

int8 320x320

python3 detect.py \
--image ../dataset/bdd100k/images/100k/val/b2d8704e-66d10551.jpg  \
--class_num 10 \
--class_txt myclass.txt \
--input_size 320 \
--quantize_mode \
--model_path ../tflite_model/yolov5s_int8_320.tflite

result
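
The four invocations above differ only in the input size and in whether the int8 model is used. A minimal sketch of a helper that builds them, using only the flags shown above; the helper name is hypothetical, and it assumes the models follow the yolov5s_{fp32,int8}_{size}.tflite naming from the conversion step.

```python
def build_detect_cmd(input_size, int8, image, class_num=10, class_txt="myclass.txt"):
    """Build the detect.py argument list for one model variant."""
    precision = "int8" if int8 else "fp32"
    cmd = [
        "python3", "detect.py",
        "--image", image,
        "--class_num", str(class_num),
        "--class_txt", class_txt,
        "--input_size", str(input_size),
        "--model_path", f"../tflite_model/yolov5s_{precision}_{input_size}.tflite",
    ]
    if int8:
        # The quantized models need the extra flag, as in the commands above.
        cmd.append("--quantize_mode")
    return cmd


IMAGE = "../dataset/bdd100k/images/100k/val/b2d8704e-66d10551.jpg"
for size in (640, 320):
    for int8 in (False, True):
        print(" ".join(build_detect_cmd(size, int8, IMAGE)))
```

Each list can be passed to subprocess.run(cmd, cwd="/workspace/host", check=True) to run the variants in sequence.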

lp6m commented 2 years ago

Build App

We modified some of the application code to support 10 classes.
https://github.com/lp6m/yolov5s_android/commit/59ecd8c6a3c5fd537d4bebcdede2fd6f450ae98f https://github.com/lp6m/yolov5s_android/commit/f334040ecb7ca2cbeda6e04f186caba0396939b6

cpp/postprocess.cpp: change class_num from 80 to 10.
MainActivity.java: remove the use of get_coco91_from_coco80.
TfliteRunner.java: remove the use of get_coco91_from_coco80, and update the output buffer size and the class label strings.
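
The output buffer size depends on the number of classes. A minimal sketch of the arithmetic, assuming the default yolov5s head (3 anchors per scale, strides 8, 16 and 32) and the [1, grid, grid, channels] layout that appears in the shape errors reported in this thread; the function name is hypothetical.

```python
def expected_output_shapes(input_size, num_classes, anchors_per_scale=3,
                           strides=(8, 16, 32)):
    """Return the expected shape of each detection output tensor."""
    # Each anchor predicts 4 box values + 1 objectness score + one score per class.
    channels = anchors_per_scale * (5 + num_classes)
    return [(1, input_size // s, input_size // s, channels) for s in strides]


print(expected_output_shapes(640, 80))
# -> [(1, 80, 80, 255), (1, 40, 40, 255), (1, 20, 20, 255)]
print(expected_output_shapes(640, 10))
# -> [(1, 80, 80, 45), (1, 40, 40, 45), (1, 20, 20, 45)]
```

A "Cannot copy from a TensorFlowLite tensor" error whose last dimensions disagree usually means the Java buffer was sized for a different class count or anchor count than the converted model.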

zenetio commented 2 years ago

@lp6m, thanks for this tutorial, which clarifies many previous discussion points. I have a question about get_coco91_from_coco80, which is not clear to me. COCO has 80 objects and your custom data adds 10 new objects. If we now have 90 objects and the string list starts at index 1, the variable name is correct when referencing 91, but in the code the string index only has 90 entries (1->90). Am I missing something here?

lp6m commented 2 years ago

If you don't use the COCO dataset, remove the 'get_coco91_from_coco80' function as this tutorial did. 'get_coco91_from_coco80' is only for the COCO dataset. The COCO annotation json file has 91 classes, but the trained model has only 80 classes, so we have to convert the class labels for mAP measurement.

c.f. https://tech.amikelive.com/node-718/what-object-categories-labels-are-in-coco-dataset/
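
To illustrate the conversion: COCO annotation files use category ids from 1 to 90, ten of which are never used, while the model outputs contiguous indices 0 to 79. A minimal sketch of the lookup, with hypothetical names; it is not the repository's get_coco91_from_coco80 implementation.

```python
# Category ids in the 1..90 range that have no annotations in COCO.
UNUSED_COCO_IDS = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83}

# Position = model class index (0..79), value = COCO category id.
COCO80_TO_COCO91 = [i for i in range(1, 91) if i not in UNUSED_COCO_IDS]


def coco91_from_coco80(class_index):
    """Map a contiguous model class index to the sparse COCO category id."""
    return COCO80_TO_COCO91[class_index]


print(len(COCO80_TO_COCO91))   # -> 80
print(coco91_from_coco80(0))   # -> 1
print(coco91_from_coco80(11))  # -> 13 (id 12 is unused)
print(coco91_from_coco80(79))  # -> 90
```

A custom dataset with contiguous class ids needs no such table, which is why the tutorial removes the call.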

sandyflute commented 2 years ago

Hello @lp6m -san, I am making the same changes for 2 classes but I get dimension-related errors:

Cannot copy from a TensorFlowLite tensor (Identity) with shape [1,80,80,28] to a java object with shape [1,80,80,21]

Thanks for sharing the code. Am I missing something? Also, like @Houangnt, I am not able to use the camera for live inference, but I will create a separate issue for that.

Houangnt commented 2 years ago

i got the error :(((

lp6m commented 2 years ago

@Houangnt Can you attach tflite model file?

lp6m commented 2 years ago

Try to change conf_thresh and retry inference.

zenetio commented 2 years ago

I trained my model with a custom dataset with 4 classes. The torch evaluation was very good. When I try to run inference using int8 or fp32, I am getting the following error: Inference failed: Cannot copy from a TensorFlowLite tensor (identity) with shape [1,80,80,27] to a Java object with shape [1,80,80,255] Any idea if this is related to a bad model conversion?

lp6m commented 2 years ago

@zenetio You have to modify the android app source code: https://github.com/lp6m/yolov5s_android/commit/59ecd8c6a3c5fd537d4bebcdede2fd6f450ae98f#diff-ffad9731e5056a3ec008d0ef0df43f04ec064b4d90044c696e43cd5abfb2f8a9L46

zenetio commented 2 years ago

@lp6m Thanks for the reply. I had made changes, but after checking your comments I found I had missed one update. Now it is working. The predictions are not so good, so I will train for more epochs.

zenetio commented 2 years ago

@sandyflute It seems you are facing the same issue I had. Please check the comment from @lp6m above.

lp6m commented 2 years ago

I found a bug in postprocess.cpp and fixed it.

https://github.com/lp6m/yolov5s_android/commit/f334040ecb7ca2cbeda6e04f186caba0396939b6

lp6m commented 2 years ago

@Houangnt @zenetio Please check this bug.

zenetio commented 2 years ago

@lp6m good catch about the postprocess.cpp file. The output of my model was crazy but now it is predicting correctly all the classes. Thanks

lp6m commented 2 years ago

@zenetio please create a new issue.

rajeshgangireddy commented 2 years ago

@lp6m I am using a YOLOv5x6 with an input size of 1280x1280 (a bit of an overkill, but I wanted to test this). I figured out the changes required for converting to TFLite, but I am not sure of the changes required in the cpp file. Could you please tell me what changes would be required in postprocess.cpp? There are 4 Convs instead of 3 as shown. Edit: I figured it out. After a couple of changes, it works :).

sandyflute commented 2 years ago

@zenetio and @lp6m, my issue was not related to the bug fix that @lp6m-san has provided. I had that already fixed in my code. While training the model, I used the hyper-parameter evolution provided by YOLOv5. Because of this, the number of anchors changed, and so the output buffer size also changed. For now, I have manually hard-coded the change in the app, but I am looking for better ways to handle it.

lp6m commented 2 years ago

@sandyflute To support different types of models, we need implementation of getting the output buffer size from the tflite model, but we do not plan to update it at this time.
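
As a rough illustration of what such an implementation would compute: given an output tensor shape and the class count, the number of anchors per scale follows from the channel dimension. The sketch below is hypothetical and not part of the app. The shape itself could be read from the model, for example via get_output_details() on the Python tf.lite.Interpreter or, to our knowledge, getOutputTensor(i).shape() on the Java Interpreter.

```python
def anchors_per_scale(output_shape, num_classes):
    """Infer the anchor count from the last (channel) dimension of an output tensor."""
    channels = output_shape[-1]
    per_anchor = 5 + num_classes  # 4 box values + objectness + class scores
    if channels % per_anchor != 0:
        raise ValueError(
            f"{channels} channels is not a multiple of {per_anchor}; "
            "check the class count")
    return channels // per_anchor


# Shapes taken from the error messages reported in this thread.
print(anchors_per_scale([1, 80, 80, 28], num_classes=2))  # -> 4
print(anchors_per_scale([1, 80, 80, 27], num_classes=4))  # -> 3
```

The first case matches the report above: with 2 classes, 28 channels implies 4 anchors per scale rather than the default 3, so a buffer sized for 21 channels cannot hold the output.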

sandyflute commented 2 years ago

@lp6m-san , thanks for sharing this code publicly. i have used yolov5s only.

Houangnt commented 2 years ago

@Houangnt @zenetio Please check this bug.

I fixed it, but I can't get the speed to be as good as your model's (I still use the yolov5s architecture, with 4 classes).

lp6m / yolov5s_android

Custom Model Intergration Tutorial #14