lp6m closed this issue 2 years ago
Thankfully, this repository has been starred by many people, so we have written a custom model integration tutorial in response to their requests.
Here, we train a yolov5s model on the BDD100K dataset.
Create an account on the BDD100K website, then download the 100K Images and Labels.
Convert the BDD100K dataset to YOLO format.
import json
import os
import cv2

# Map BDD100K category names to YOLO class ids.
categories = {'car': 0,
              'bus': 1,
              'person': 2,
              'bike': 3,
              'truck': 4,
              'motor': 5,
              'train': 6,
              'rider': 7,
              'traffic sign': 8,
              'traffic light': 9}


def generate_yolo_labels(json_path, img_base_path, save_path):
    count = 0
    # These categories have no 2D box annotation, so skip them.
    ignore_categories = ["drivable area", "lane"]
    with open(json_path) as json_file:
        data = json.load(json_file)
    if not os.path.exists(save_path):
        os.makedirs(save_path)
    print(len(data))
    for image_data in data:
        img_name = image_data['name']
        img_path = os.path.join(img_base_path, img_name)
        if not os.path.exists(img_path):
            continue
        file_path = os.path.join(save_path, img_name.replace(".jpg", ".txt"))
        img = cv2.imread(img_path)
        if img is None:
            # cv2.imread returns None for unreadable files; skip them.
            continue
        img_h, img_w, _ = img.shape
        img_labels = [l for l in image_data['labels']
                      if l['category'] not in ignore_categories]
        with open(file_path, 'w') as f_label:
            for label in img_labels:
                y1 = label['box2d']['y1']
                x2 = label['box2d']['x2']
                x1 = label['box2d']['x1']
                y2 = label['box2d']['y2']
                class_name = label['category']
                class_id = categories[class_name]
                # Box center and size in pixels.
                bbox_x = (x1 + x2)/2
                bbox_y = (y1 + y2)/2
                bbox_width = x2-x1
                bbox_height = y2-y1
                # Normalize by the image size, as the YOLO format expects.
                bbox_x_norm = bbox_x / img_w
                bbox_y_norm = bbox_y / img_h
                bbox_width_norm = bbox_width / img_w
                bbox_height_norm = bbox_height / img_h
                line_to_write = '{} {} {} {} {}'.format(
                    class_id, bbox_x_norm, bbox_y_norm, bbox_width_norm, bbox_height_norm)
                f_label.write(line_to_write + "\n")
        count += 1
    print(f"generate complete for {count} images.")


generate_yolo_labels(
    "/workspace/dataset/bdd100k/labels/bdd100k_labels_images_train.json",
    "/workspace/dataset/bdd100k/images/100k/train",
    "/workspace/dataset/bdd100k/labels/100k/train"
)
generate_yolo_labels(
    "/workspace/dataset/bdd100k/labels/bdd100k_labels_images_val.json",
    "/workspace/dataset/bdd100k/images/100k/val",
    "/workspace/dataset/bdd100k/labels/100k/val"
)
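As a quick sanity check of the box arithmetic in the script above, here is a minimal sketch of the same conversion applied to a single box. The function name to_yolo_bbox and the sample coordinates are made up for illustration, and the 1280x720 image size is an assumption about the input frame.

```python
def to_yolo_bbox(x1, y1, x2, y2, img_w, img_h):
    """Convert pixel corner coordinates to a normalized YOLO box.

    Returns (center_x, center_y, width, height), each divided by the
    image width or height so that the values fall in [0, 1].
    """
    center_x = (x1 + x2) / 2 / img_w
    center_y = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return center_x, center_y, width, height


# Hypothetical box covering the middle half of a 1280x720 frame.
box = to_yolo_bbox(320, 180, 960, 540, img_w=1280, img_h=720)
print(box)  # (0.5, 0.5, 0.5, 0.5)
```

A label line for class id 0 would then read "0 0.5 0.5 0.5 0.5".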
docker run -it --shm-size=8g --gpus all -v `pwd`:/workspace yolov5s_android:latest bash
export PYTHONIOENCODING=utf8
python3 train.py --img 640 --batch 4 --epochs 25 --data bdd100k.yaml --weights yolov5s.pt
path: /workspace/dataset/bdd100k/images/100k # dataset root dir
train: train
val: val
test: # test images (optional)
# Classes
nc: 10 # number of classes
names: ['car', 'bus', 'person', 'bike', 'truck', 'motor', 'train', 'rider', 'traffic sign', 'traffic light'] # class names
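The order of names in the yaml must match the class ids written by the label conversion script, otherwise the detections will be labeled with the wrong class. A minimal sketch of such a check, where the helper name check_class_order is hypothetical:

```python
def check_class_order(categories, names, nc):
    """Return True if `names` lists the classes in class-id order.

    categories: dict mapping class name -> class id (as in the label script)
    names:      list of class names as written in the dataset yaml
    nc:         number of classes as written in the dataset yaml
    """
    expected = [name for name, _ in sorted(categories.items(), key=lambda kv: kv[1])]
    return len(names) == nc and names == expected


categories = {'car': 0, 'bus': 1, 'person': 2, 'bike': 3, 'truck': 4,
              'motor': 5, 'train': 6, 'rider': 7,
              'traffic sign': 8, 'traffic light': 9}
names = ['car', 'bus', 'person', 'bike', 'truck', 'motor', 'train',
         'rider', 'traffic sign', 'traffic light']

print(check_class_order(categories, names, nc=10))  # True
```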
We stopped training at epoch 19 because the mAP was already sufficient for testing.
We modified quantize.py to use local image data for the calibration dataset instead of tfds. https://github.com/lp6m/yolov5s_android/commit/288985430df60a9ac1396ad461cb729c83936f87
cd /workspace/yolov5/
cp runs/train/exp8/weights/best.pt ./yolov5s.pt
python3 export.py --weights ./yolov5s.pt --img-size 640 640 --simplify
python3 /opt/intel/openvino_2021.3.394/deployment_tools/model_optimizer/mo.py --input_model yolov5s.onnx --input_shape [1,3,640,640] --output_dir ./openvino --data_type FP32 --output Conv_253,Conv_302,Conv_351
source /opt/intel/openvino_2021/bin/setupvars.sh
export PYTHONPATH=/opt/intel/openvino_2021/python/python3.6/:$PYTHONPATH
openvino2tensorflow \
--model_path ./openvino/yolov5s.xml \
--model_output_path tflite \
--output_pb \
--output_saved_model \
--output_no_quant_float32_tflite
cp /workspace/yolov5/tflite/model_float32.tflite ../tflite_model/yolov5s_fp32_640.tflite
cd /workspace/convert_model
python3 quantize.py --image_dir /workspace/dataset/bdd100k/images/100k/val/ --input_size 640
cp /workspace/yolov5/tflite/model_quantized.tflite ../tflite_model/yolov5s_int8_640.tflite
input_size = 320
python3 export.py --weights ./yolov5s.pt --img-size 320 320 --simplify
python3 /opt/intel/openvino_2021.3.394/deployment_tools/model_optimizer/mo.py --input_model yolov5s.onnx --input_shape [1,3,320,320] --output_dir ./openvino --data_type FP32 --output Conv_253,Conv_302,Conv_351
source /opt/intel/openvino_2021/bin/setupvars.sh
export PYTHONPATH=/opt/intel/openvino_2021/python/python3.6/:$PYTHONPATH
openvino2tensorflow \
--model_path ./openvino/yolov5s.xml \
--model_output_path tflite \
--output_pb \
--output_saved_model \
--output_no_quant_float32_tflite
cp /workspace/yolov5/tflite/model_float32.tflite ../tflite_model/yolov5s_fp32_320.tflite
cd /workspace/convert_model
python3 quantize.py --image_dir /workspace/dataset/bdd100k/images/100k/val/ --input_size 320
cp /workspace/yolov5/tflite/model_quantized.tflite ../tflite_model/yolov5s_int8_320.tflite
Converted models are located at https://github.com/lp6m/yolov5s_android/tree/custom_train/tflite_model
We modified some scripts in ./host to support custom class_num and class_txt. https://github.com/lp6m/yolov5s_android/commit/4ca3e2757cdcd051267d1eda58da51b42a1437b8
We have not modified evaluate.py for mAP measurement.
fp32 640x640
cd /workspace/host
python3 detect.py \
--image ../dataset/bdd100k/images/100k/val/b2d8704e-66d10551.jpg \
--class_num 10 \
--class_txt myclass.txt \
--input_size 640 \
--model_path ../tflite_model/yolov5s_fp32_640.tflite
int8 640x640
python3 detect.py \
--image ../dataset/bdd100k/images/100k/val/b2d8704e-66d10551.jpg \
--class_num 10 \
--class_txt myclass.txt \
--input_size 640 \
--quantize_mode \
--model_path ../tflite_model/yolov5s_int8_640.tflite
fp32 320x320
python3 detect.py \
--image ../dataset/bdd100k/images/100k/val/b2d8704e-66d10551.jpg \
--class_num 10 \
--class_txt myclass.txt \
--input_size 320 \
--model_path ../tflite_model/yolov5s_fp32_320.tflite
int8 320x320
python3 detect.py \
--image ../dataset/bdd100k/images/100k/val/b2d8704e-66d10551.jpg \
--class_num 10 \
--class_txt myclass.txt \
--input_size 320 \
--quantize_mode \
--model_path ../tflite_model/yolov5s_int8_320.tflite
We modified the application code to support 10 classes.
https://github.com/lp6m/yolov5s_android/commit/59ecd8c6a3c5fd537d4bebcdede2fd6f450ae98f
https://github.com/lp6m/yolov5s_android/commit/f334040ecb7ca2cbeda6e04f186caba0396939b6
cpp/postprocess.cpp: change class_num from 80 to 10.
MainActivity.java: remove the use of get_coco91_from_coco80.
TfliteRunner.java: remove the use of get_coco91_from_coco80, and modify the output buffer size and class label strings.
@lp6m, thanks for this tutorial, which clarifies many previous discussion points. I have a question about get_coco91_from_coco80 that is not clear to me. COCO has 80 objects and your custom data adds 10 new objects. If we now have 90 objects and the string list starts at index 1, the variable name is correct when referencing 91, but in the code the string index only covers 90 entries (1->90). Am I missing something here?
If you don't use the COCO dataset, remove the get_coco91_from_coco80 function, as this tutorial did. get_coco91_from_coco80 is only for the COCO dataset. The COCO annotation JSON file has 91 classes, but the trained model has only 80 classes, so we have to convert the class labels for mAP measurement.
c.f. https://tech.amikelive.com/node-718/what-object-categories-labels-are-in-coco-dataset/
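To illustrate the conversion described above, here is a minimal sketch. It is not the repository's get_coco91_from_coco80 implementation: the function name coco80_to_coco91 is hypothetical, and the set of unused category ids is taken from the commonly published COCO category list, so treat it as an assumption to verify against your own annotation file.

```python
# Category ids in the 1..90 range that do not appear in the COCO detection
# annotations (assumption: taken from the published COCO category list).
UNUSED_COCO_IDS = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83}

# Position i holds the annotation category id for model class index i.
COCO80_TO_COCO91 = [i for i in range(1, 91) if i not in UNUSED_COCO_IDS]


def coco80_to_coco91(class_index):
    """Map a model class index (0..79) to a COCO annotation category id."""
    return COCO80_TO_COCO91[class_index]


print(len(COCO80_TO_COCO91))  # 80
print(coco80_to_coco91(0))    # 1
print(coco80_to_coco91(11))   # 13, because id 12 is unused
```

A model trained on a custom dataset writes its own class ids directly, so no such table is needed there.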
Hello @lp6m-san, I am making the same changes for 2 classes, but I get a dimension-related error:
Cannot copy from a TensorFlowLite tensor (Identity) with shape [1,80,80,28] to a java object with shape [1,80,80,21]
Thanks for sharing the code. Am I missing something? I am also not able to use the camera for live inference, similar to @Houangnt, but I will create a separate issue for that.
i got the error :(((
@Houangnt Can you attach tflite model file?
Try changing conf_thresh and retry inference.
I trained my model on a custom dataset with 4 classes. The torch evaluation was very good. When I try to run inference using int8 or fp32, I get the following error: Inference failed: Cannot copy from a TensorFlowLite tensor (identity) with shape [1,80,80,27] to a Java object with shape [1,80,80,255] Any idea whether this is related to a bad model conversion?
@zenetio You have to modify the android app source code: https://github.com/lp6m/yolov5s_android/commit/59ecd8c6a3c5fd537d4bebcdede2fd6f450ae98f#diff-ffad9731e5056a3ec008d0ef0df43f04ec064b4d90044c696e43cd5abfb2f8a9L46
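The two shapes in the error message follow from the class count: each YOLOv5 output scale has num_anchors * (num_classes + 5) channels, so 4 classes give 27 while the unmodified 80-class app expects 255. A minimal sketch of that arithmetic, assuming the standard three-scale head with strides 8, 16 and 32 and 3 anchors per scale (the function name yolov5_output_shapes is hypothetical):

```python
def yolov5_output_shapes(input_size, num_classes, num_anchors=3, strides=(8, 16, 32)):
    """Return the expected NHWC output shape for each detection scale.

    Each anchor predicts 4 box values, 1 objectness score and one score
    per class, hence num_classes + 5 values per anchor.
    """
    channels = num_anchors * (num_classes + 5)
    return [(1, input_size // s, input_size // s, channels) for s in strides]


print(yolov5_output_shapes(640, 4))   # model trained with 4 classes
print(yolov5_output_shapes(640, 80))  # buffer size of the 80-class app
```

The first shape printed in each list is (1, 80, 80, 27) and (1, 80, 80, 255) respectively, matching the tensor and the Java object in the error.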
@lp6m Thanks for the reply. I had made the changes, but after checking your comments I found that I had missed one update. Now it is working. The predictions are not so good, so I will train for more epochs.
@sandyflute It seems you are facing the same issue I had. Please check the comment from @lp6m above.
I found a bug in postprocess.cpp and fixed it.
https://github.com/lp6m/yolov5s_android/commit/f334040ecb7ca2cbeda6e04f186caba0396939b6
@Houangnt @zenetio Please check this bug.
@lp6m good catch about the postprocess.cpp file. The output of my model was crazy but now it is predicting correctly all the classes. Thanks
@zenetio please create a new issue.
@lp6m I am using a YOLOv5x6 with an input size of 1280x1280 (a bit of an overkill, but I wanted to test this). I figured out the changes required for converting to TFLite, but I am not sure of the changes required in the cpp file. Could you please tell me what changes would be required in postprocess.cpp? There are 4 Convs instead of the 3 shown. Edit: I figured it out. After a couple of changes, it works :).
@zenetio and @lp6m, my issue was not related to the bug fix that @lp6m-san has provided. I had that already fixed in my code. While training the model, I ran the hyper-parameter evolution provided by YOLOv5. Because of this, the number of anchors changed, and so the output buffer size also changed. For now, I have manually hard-coded the change in the app, but I am looking for better ways to handle it.
@sandyflute To support different types of models, the app would need to read the output buffer size from the tflite model, but we do not plan to implement this at this time.
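For reference, the anchor count can be recovered from an output tensor's channel dimension once the class count is known. The sketch below only shows that arithmetic under the standard YOLOv5 head layout; the function name anchors_per_scale is hypothetical, and the shape itself would still have to be read from the loaded model.

```python
def anchors_per_scale(channels, num_classes):
    """Infer the anchors per scale from an output tensor's channel count.

    Assumes every anchor contributes num_classes + 5 values
    (4 box values, 1 objectness score, num_classes class scores).
    """
    values_per_anchor = num_classes + 5
    if channels % values_per_anchor != 0:
        raise ValueError(
            f"{channels} channels is not a multiple of {values_per_anchor}")
    return channels // values_per_anchor


# Shape [1,80,80,28] from the 2-class error reported earlier in this thread.
print(anchors_per_scale(28, num_classes=2))  # 4
# Shape [1,80,80,21] that the app expected for 2 classes.
print(anchors_per_scale(21, num_classes=2))  # 3
```

This is consistent with the explanation above: anchor evolution produced 4 anchors per scale, while the app buffer was sized for 3.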
@lp6m-san, thanks for sharing this code publicly. I have used yolov5s only.
@Houangnt @zenetio Please check this bug.
I fixed it, but I can't get speed as good as your model (I still use the yolov5s architecture, with 4 classes).
Working Branch: custom_train