Model | Size | mAP val 0.5:0.95 | mAP val 0.5 | FPS (AGX Xavier, TensorRT fp16, batch=16, NMS included) | Params train / infer (M) | Download |
---|---|---|---|---|---|---|
EdgeYOLO-Tiny-LRELU | 416 / 640 | 33.1 / 37.8 | 50.5 / 56.7 | 206 / 109 | 7.6 / 7.0 | github |
EdgeYOLO-Tiny | 416 / 640 | 37.2 / 41.4 | 55.4 / 60.4 | 136 / 67 | 5.8 / 5.5 | github |
EdgeYOLO-S | 640 | 44.1 | 63.3 | 53 | 9.9 / 9.3 | github |
EdgeYOLO-M | 640 | 47.5 | 66.6 | 46 | 19.0 / 17.8 | github |
EdgeYOLO | 640 | 50.6 | 69.8 | 34 | 41.2 / 40.5 | github |
Model | Size | mAP val 0.5:0.95 | mAP val 0.5 | Download |
---|---|---|---|---|
EdgeYOLO-Tiny-LRELU | 416 / 640 | 12.1 / 18.5 | 22.8 / 33.6 | github |
EdgeYOLO-Tiny | 416 / 640 | 14.9 / 21.8 | 27.3 / 38.5 | github |
EdgeYOLO-S | 640 | 23.6 | 40.8 | github |
EdgeYOLO-M | 640 | 25.0 | 42.9 | github |
EdgeYOLO | 640 | 25.9 / 26.9 | 43.9 / 45.4 | github (legacy) / github (new) |
git clone https://github.com/LSH9832/edgeyolo.git
cd edgeyolo
pip install -r requirements.txt
If you use TensorRT, make sure torch2trt and the TensorRT Development Toolkit (version > 7.1.3.0) are installed.
git clone https://github.com/NVIDIA-AI-IOT/torch2trt.git
cd torch2trt
python setup.py install
Alternatively, to use the same version of torch2trt as ours, download it here.
Download the Docker image from Baiduyun (14.3 GB, password: ujar).
docker import edgeyolo_deploy.tar.gz edgeyolo:latest
Run the container:
docker run -it \
--runtime=nvidia \
-e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
-e NVIDIA_VISIBLE_DEVICES=all \
--shm-size 15g \
-w /code \
-v "/path/to/your/edgeyolo/parent_dir":/code \
-v "/path/to/your/dataset/parent_dir":/dataset \
edgeyolo:latest
Then you can use "docker_export.py" instead of "export.py".
First, download the weights here.
python detect.py --weights edgeyolo_coco.pth --source XXX.mp4 --fp16
# all options
python detect.py --weights edgeyolo_coco.pth
--source /XX/XXX.mp4 # or a directory of images, such as /dataset/coco2017/val2017 (jpg/jpeg, png, bmp and webp are supported)
--conf-thres 0.25
--nms-thres 0.5
--input-size 640 640
--batch 1
--save-dir ./output/detect/imgs # if you press "s", the current frame will be saved in this dir
--fp16
--no-fuse # do not reparameterize model
--no-label # do not draw label with class name and confidence
--mp # use multi-process to show images more smoothly when batch > 1
--fps 30 # max fps limitation, valid only when option --mp is used
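The option listing above is annotated for reading and cannot be pasted into a shell as-is. A minimal sketch of assembling the same command from Python follows. It uses only flags shown above; the helper name `build_detect_cmd` and its keyword arguments are hypothetical and not part of the repository.

```python
import shlex


def build_detect_cmd(weights, source, conf_thres=0.25, nms_thres=0.5,
                     input_size=(640, 640), batch=1, fp16=False,
                     mp=False, fps=None):
    """Assemble a detect.py argument list (hypothetical helper, not in the repo)."""
    cmd = [
        "python", "detect.py",
        "--weights", weights,
        "--source", source,
        "--conf-thres", str(conf_thres),
        "--nms-thres", str(nms_thres),
        "--input-size", str(input_size[0]), str(input_size[1]),
        "--batch", str(batch),
    ]
    if fp16:
        cmd.append("--fp16")
    if mp:
        cmd.append("--mp")
        if fps is not None:
            # --fps is documented as valid only together with --mp
            cmd += ["--fps", str(fps)]
    return cmd


cmd = build_detect_cmd("edgeyolo_coco.pth", "XXX.mp4", fp16=True)
print(shlex.join(cmd))  # a single shell-safe command line
```

The resulting list can be passed directly to `subprocess.run(cmd)` from the repository root.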
(COCO, YOLO, VOC, VisDrone and DOTA formats are supported)
type: "coco"                       # dataset format (lowercase); COCO, YOLO, VOC, VisDrone and DOTA formats are currently supported
dataset_path: "/dataset/coco2017"  # root dir of your dataset
kwargs:
  suffix: "jpg"    # suffix of your dataset's images
  use_cache: true  # tested on i5-12490f: total loading time 52s -> 10s (seg enabled) and 39s -> 4s (seg disabled)
train:
  image_dir: "images/train2017"                  # train set image dir
  label: "annotations/instances_train2017.json"  # train set label file (single-label-file formats) or directory (multi-label-file formats)
val:
  image_dir: "images/val2017"                    # evaluation set image dir
  label: "annotations/instances_val2017.json"    # evaluation set label file or directory
test:
  test_dir: "test2017"  # test set image dir (not used in the code yet, but will be)
segmentaion_enabled: true  # whether this dataset has segmentation labels and you are going to use them instead of bbox labels
names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
'hair drier', 'toothbrush'] # category names
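Before starting a long training run, it can help to check that the paths in the dataset config exist. The sketch below assumes that `image_dir` and `label` are relative to `dataset_path` (implied by the "root dir" comment, but not confirmed here) and that the config has already been loaded into a plain dict. The helper `resolve_dataset_paths` is hypothetical and not part of the repository.

```python
import os


def resolve_dataset_paths(cfg):
    """Join dataset_path with the train/val entries (hypothetical helper).

    Assumption: image_dir and label are relative to dataset_path.
    """
    root = cfg["dataset_path"]
    resolved = {}
    for split in ("train", "val"):
        section = cfg[split]
        resolved[split] = {
            "image_dir": os.path.join(root, section["image_dir"]),
            "label": os.path.join(root, section["label"]),
        }
    return resolved


# Same values as the example config above, written as a dict.
cfg = {
    "type": "coco",
    "dataset_path": "/dataset/coco2017",
    "train": {"image_dir": "images/train2017",
              "label": "annotations/instances_train2017.json"},
    "val": {"image_dir": "images/val2017",
            "label": "annotations/instances_val2017.json"},
}

paths = resolve_dataset_paths(cfg)
# Collect every resolved path that is missing on this machine.
missing = [p for split in paths.values() for p in split.values()
           if not os.path.exists(p)]
print("missing paths:", missing)
```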
python train.py --cfg ./params/train/train_XXX.yaml
You can plot the loss, learning-rate and precision (AP50 and AP50:95) curves with "plot.py".
# --all: plot all figures (or select with --lr, --ap, --loss)
# -f: training output path (or output_path/eval.yaml for --ap, output_path/log.txt for --lr and --loss)
# --no-show: do not call plt.show() (for devices without a desktop environment, or if you only want to save the figures)
# --save: save the figures
# --format: output formats
python plot.py --all \
    -f ./output/train/edgeyolo_tiny_lrelu \
    --no-show \
    --save \
    --format pdf png svg jpg eps
python evaluate.py --weights edgeyolo_coco.pth --dataset params/dataset/XXX.yaml --batch 16 --device 0
# all options
python evaluate.py --weights edgeyolo_coco.pth # or tensorrt model: output/export/edgeyolo_coco/model.pt
--dataset params/dataset/XXX.yaml
--batch 16 # batch size for each gpu, not valid if it's tensorrt model
--device 0
--input-size 640 640 # height, width
--trt # if you use tensorrt model add this option
--save # save weights without optimizer params and set epoch to -1
python export.py --onnx --weights edgeyolo_coco.pth --batch 1
# all options
python export.py --onnx # or --onnx-only if tensorrt and torch2trt are not installed
--weights edgeyolo_coco.pth
--input-size 640 640 # height, width
--batch 1
--opset 11
--no-simplify # do not simplify this model
It generates:
output/export/edgeyolo_coco/640x640_batch1.onnx
- TensorRT
```shell
# fp16
python export.py --trt --weights edgeyolo_coco.pth --batch 1 --workspace 8
# int8
python export.py --trt --weights edgeyolo_coco.pth --batch 1 --workspace 8 --int8 --dataset params/dataset/coco.yaml --num-imgs 1024
# all options
python export.py --trt # you can add --onnx and its related options to export both models
--weights edgeyolo_coco.pth
--input-size 640 640 # height, width
--batch 1
--workspace 10 # (GB)
--no-fp16 # fp16 mode is the default; use this option to disable it (fp32)
--int8 # int8 mode; the following options are needed for calibration
--dataset params/dataset/coco.yaml # generates calibration images from its val images (upper limit: 5120)
--train # use train images instead of val images (upper limit: 5120)
--all # use all images (upper limit: 5120)
--num-imgs 512 # (upper limit: 5120)
```
It generates:
(optional) output/export/edgeyolo_coco/640x640_batch1.onnx
output/export/edgeyolo_coco/640x640_batch1_fp16(int8).pt # for python inference
output/export/edgeyolo_coco/640x640_batch1_fp16(int8).engine # for c++ inference
output/export/edgeyolo_coco/640x640_batch1_fp16(int8).json # for c++ inference
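The output names above follow a regular pattern, so a script can predict them from the export arguments. This is a minimal sketch that assumes the output directory is named after the weights file stem (as in `edgeyolo_coco.pth` to `output/export/edgeyolo_coco`). It covers only the fp16 and int8 names shown above; the name produced with `--no-fp16` is not documented here and is left out. The helper `expected_export_files` is hypothetical.

```python
from pathlib import PurePosixPath


def expected_export_files(weights, input_size=(640, 640), batch=1,
                          int8=False, with_onnx=False):
    """Predict TensorRT export outputs (hypothetical helper, not in the repo).

    Assumption: the output directory is output/export/<weights file stem>.
    """
    out_dir = PurePosixPath("output/export") / PurePosixPath(weights).stem
    base = f"{input_size[0]}x{input_size[1]}_batch{batch}"
    mode = "int8" if int8 else "fp16"
    # .pt is for Python inference; .engine and .json are for C++ inference
    files = [str(out_dir / f"{base}_{mode}.{ext}")
             for ext in ("pt", "engine", "json")]
    if with_onnx:
        files.insert(0, str(out_dir / f"{base}.onnx"))
    return files


for path in expected_export_files("edgeyolo_coco.pth", int8=True):
    print(path)
```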
COCO2017-TensorRT-int8
Int8 Model | Size | Calibration images | Workspace (GB) | mAP val 0.5:0.95 | mAP val 0.5 | FPS (RTX 3060, TensorRT int8, batch=16, NMS included) |
---|---|---|---|---|---|---|
Tiny-LRELU | 416 / 640 | 512 | 8 | 31.5 / 36.4 | 48.7 / 55.5 | 730 / 360 |
Tiny | 416 / 640 | 512 | 8 | 34.9 / 39.8 | 53.1 / 59.5 | 549 / 288 |
S | 640 | 512 | 8 | 42.4 | 61.8 | 233 |
M | 640 | 512 | 8 | 45.2 | 64.2 | 211 |
L | 640 | 512 | 8 | 49.1 | 68.0 | 176 |
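To get a rough sense of the accuracy cost of int8 quantization, the mAP 0.5:0.95 values at size 640 can be compared with those in the first results table. The sketch below copies the numbers from both tables. It assumes that the int8 row "L" corresponds to the model listed as "EdgeYOLO" and that both tables were evaluated on the same validation set; neither is stated explicitly here.

```python
# mAP 0.5:0.95 at input size 640, copied from the first results table.
# Assumption: int8 row "L" corresponds to the model listed as "EdgeYOLO".
baseline = {"S": 44.1, "M": 47.5, "L": 50.6}

# mAP 0.5:0.95 at input size 640, copied from the int8 table above.
int8 = {"S": 42.4, "M": 45.2, "L": 49.1}

# Absolute drop in mAP points, rounded to one decimal like the tables.
drop = {name: round(baseline[name] - int8[name], 1) for name in baseline}

for name, points in drop.items():
    print(f"{name}: -{points} mAP")
```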
python detect.py --trt --weights output/export/edgeyolo_coco/640x640_batch1_int8.pt --source XXX.mp4
# all options
python detect.py --trt
--weights output/export/edgeyolo_coco/640x640_batch1_int8.pt
--source XXX.mp4
--legacy # use if inputs were normalized with "img = img / 255" when the model was trained
--use-decoder # use for original YOLOX TensorRT models from before version 0.3.0
--mp # use multi-process to show images more smoothly when batch > 1
--fps 30 # max fps limitation, valid only when option --mp is used
# build
cd cpp/tensorrt
mkdir build && cd build
cmake ..
make
# help
./yolo -?
./yolo --help
# run
# ./yolo [engine file] [source] [--conf] [--nms] [--loop] [--no-label]
# make sure your engine file and your yaml file are in the same directory
./yolo ../../../output/export/edgeyolo_coco/640x640_batch1_int8.engine ~/Videos/test.avi --conf 0.25 --nms 0.5 --loop --no-label
@article{edgeyolo2023,
title={EdgeYOLO: An Edge-Real-Time Object Detector},
author={Shihan Liu and Junlin Zha and Jian Sun and Zhuo Li and Gang Wang},
journal={arXiv preprint arXiv:2302.07483},
year={2023}
}
File "XXX/edgeyolo/edgeyolo/train/loss.py", line 667, in dynamic_k_matching
_, pos_idx = torch.topk(cost[gt_idx], k=dynamic_ks[gt_idx].item(), largest=False)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
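As the message suggests, rerunning with `CUDA_LAUNCH_BLOCKING=1` makes CUDA kernel launches synchronous, so the reported stack trace points at the call that actually failed. A minimal sketch of launching training this way from Python follows. The helper names are hypothetical; `train.py` and `--cfg` are the command shown earlier in this README.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Copy an environment and enable synchronous CUDA kernel launches."""
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_training_blocking(cfg_path):
    """Run train.py with CUDA_LAUNCH_BLOCKING=1 (hypothetical wrapper)."""
    cmd = [sys.executable, "train.py", "--cfg", cfg_path]
    return subprocess.run(cmd, env=blocking_env(), check=False)


print(blocking_env({})["CUDA_LAUNCH_BLOCKING"])
```

For example, call `run_training_blocking("./params/train/train_XXX.yaml")` from the repository root. Synchronous launches slow training down, so this is meant for debugging only.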