AILab-CVC / YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection
https://www.yoloworld.cc
GNU General Public License v3.0

tflite demo/inference issue #424

Open kezhang-cs opened 1 month ago

kezhang-cs commented 1 month ago

Hi @wondervictor and team, thank you for the great work. Could you help me with a question about the tflite model demo/inference? I first converted the _yolo_world_v2_s_obj365v1_goldg_pretrain1280ft-fc4ff4f7.pth model to onnx with cfg_path=configs/pretrain/yolo_world_v2_s_vlpan_bn_2e-3_100e_4x8gpus_obj365v1_goldg_train_1280ft_lvis_minival.py:

python deploy/export_onnx.py ${cfg_path} ${wgt_path} --opset 12 --without-bbox-decoder --without-nms

The onnx_demo.py results look okay.

Then I used

onnx2tf -i ${onnx_path} -o ${tflite_path} -oiqt  -cind "images" "tflite_calibration_data_100_images_640.npy" "[[[[0.,0.,0.]]]]" "[[[[1.,1.,1.]]]]"  -onimc "scores" "bboxes" --verbosity debug

to convert to tflite. But when running tflite_demo.py, the converted model does not seem to produce enough outputs:

  File "deploy/tflite_demo.py", line 241, in main
    inference_per_sample(interpreter,
  File "deploy/tflite_demo.py", line 137, in inference_per_sample
    scores = interp.get_tensor(output_details[1]['index'])
IndexError: list index out of range

I was using the _integer_quant.tflite file. tflite_demo.py works with the tflite file shared in the repo. Is there anything wrong with the steps I tried? Thank you!
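The IndexError above comes from reading `output_details[1]` when the interpreter reports fewer than two outputs. A minimal sketch for checking this before running inference follows. The helper names (`describe_outputs`, `require_outputs`, `load_output_details`) are hypothetical and not part of tflite_demo.py; only `tf.lite.Interpreter`, `allocate_tensors` and `get_output_details` are TensorFlow calls.

```python
def describe_outputs(output_details):
    """Return (name, shape, dtype) for every output tensor entry."""
    return [
        (d["name"], tuple(d["shape"]), getattr(d["dtype"], "__name__", str(d["dtype"])))
        for d in output_details
    ]


def require_outputs(output_details, expected=2):
    """Raise a readable error instead of an IndexError when outputs are missing."""
    found = len(output_details)
    if found < expected:
        names = [d["name"] for d in output_details]
        raise ValueError(f"expected {expected} output tensors, found {found}: {names}")
    return output_details[:expected]


def load_output_details(tflite_path):
    """Read the output details of a .tflite file (requires TensorFlow)."""
    import tensorflow as tf  # imported lazily so the helpers above work without it

    interp = tf.lite.Interpreter(model_path=tflite_path)
    interp.allocate_tensors()
    return interp.get_output_details()
```

Calling `describe_outputs(load_output_details(path))` on both the converted file and the tflite file shared in the repo would show whether they expose the same number of outputs and under which names.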

kezhang-cs commented 1 month ago

After a closer check, I noticed a possible root cause: for some unknown reason, onnx2tf reduced the number of outputs from 2 to 1. That is, the onnx model has two outputs, "scores" and "boxes", but the tflite model has only one. Also, after running onnx2tf, the "boxes" output seems to have been removed from the onnx file as well.
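One thing that can be checked directly is whether the output names passed to -onimc exist in the ONNX graph. The command earlier in this thread passes "scores" and "bboxes", while the outputs are described here as "scores" and "boxes". Whether that mismatch explains the missing output is not established in this thread. The sketch below only compares names; `missing_names` and `onnx_output_names` are hypothetical helpers, and only `onnx.load` and `model.graph.output` come from the onnx package.

```python
def missing_names(available, requested):
    """Return the requested output names that are absent from the graph outputs."""
    available = set(available)
    return [name for name in requested if name not in available]


def onnx_output_names(onnx_path):
    """List the graph output names of an ONNX file (requires the onnx package)."""
    import onnx  # imported lazily so missing_names works without it

    model = onnx.load(onnx_path)
    return [out.name for out in model.graph.output]


# Names as written in this thread, not read from an actual model file.
print(missing_names(["scores", "boxes"], ["scores", "bboxes"]))  # prints ['bboxes']
```

With a real file, `missing_names(onnx_output_names(path), ["scores", "bboxes"])` would report any requested name that the exported graph does not contain.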

May I ask how you converted the onnx model to tflite for yolo_world_x_coco_zeroshot_rep_integer_quant.tflite? That file has both scores and boxes. I'm really looking forward to your insights; I have been working on this issue for more than a week 🫠

kezhang-cs commented 1 month ago

I wanted to follow up on this question: the tflite model I converted now has both boxes and scores, but the boxes are quite different from those of the onnx model and do not make sense.
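When comparing box tensors between the two models, it may help to rule out a scaling difference first. If a TFLite output tensor is integer-typed, its entry in `get_output_details()` carries a `quantization` pair (scale, zero_point), and the real value is scale * (q - zero_point). Whether the outputs of the _integer_quant.tflite file are quantized at all is an assumption to verify from the tensor dtype. The sketch below uses hypothetical helper names and made-up numbers, and compares raw tensors on the same input before any decoding done by the demo scripts.

```python
def dequantize(values, scale, zero_point):
    """Map quantized integers to real values: real = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in values]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


# Hypothetical values for illustration only, not taken from either model.
reference = [0.0, 0.5, 1.0]   # stand-in for flattened onnx box values
quantized = [0, 64, 127]      # stand-in for flattened integer tflite box values
restored = dequantize(quantized, scale=1 / 127, zero_point=0)
print(max_abs_diff(reference, restored))
```

A small difference after dequantization would point to ordinary quantization error, while a large one would suggest the two tensors differ in layout, ordering, or meaning.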

Could you provide more details on how yolo_world_x_coco_zeroshot_rep_integer_quant.tflite was produced? Thanks!