AILab-CVC / YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection
https://www.yoloworld.cc
GNU General Public License v3.0
3.88k stars, 372 forks

Thanks for your excellent work! Some questions about YOLO-World fine-tuning (freezing), GLIP pseudo labels, and prompts. #369

Open Meicsu199345 opened 1 month ago

Meicsu199345 commented 1 month ago

Question

Thanks for your excellent work! Background: I have a risk-control dataset with 100 classes, 200k images, and 400k bounding boxes. Example classes: 1: vulgar tongue kiss, 2: porn bed photos, 3: human smoking scenes.

I want to use it to detect more than just person, car, and human smoking in bed, so do I need re-parameterization?

Questions:

Q1: Could I get some suggestions on whether my method is correct?

Q2: How can I use YOLO-World for prompting? [image]

Q3: How can I get pseudo labels (bounding boxes) from GLIP using this GitHub repo? [image]

Q4: How can I do "A.1. Re-parameterization for RepVL-PAN" using Ultralytics? [image]

Thanks a lot, YOLO-World makes my study easy~

wondervictor commented 3 weeks ago

Hi @Meicsu199345, sorry for the late reply!! Thanks for your interest in YOLO-World!

  1. Installing with torch~=1.13 and requirements.txt is fine, and we'll remove the support for torch~2.0.
  2. We only freeze the CLIP backbone for fine-tuning; all other parameters are trainable. If you have a large vocabulary, you can also make CLIP trainable.
  3. For prompt tuning, you can refer to docs/prompt_yolo_world. You just need to precompute the text embeddings and use the SimpleYOLOWorldDetector.
  4. The pseudo labels are listed here: docs/data.
  5. The repo is independent of Ultralytics, and we only support re-parameterization in this repo.
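
For point 2, freezing comes down to switching off gradients for the text-encoder parameters while leaving everything else trainable. Below is a minimal sketch, assuming PyTorch-style parameters exposed as `(name, param)` pairs. The helper `freeze_by_prefix` and the parameter names are hypothetical illustrations, not taken from the YOLO-World code base; the repo's own configs remain the authoritative way to set this.

```python
from types import SimpleNamespace
from typing import Iterable, List, Tuple


def freeze_by_prefix(named_params: Iterable[Tuple[str, object]],
                     frozen_prefixes: Tuple[str, ...]) -> List[str]:
    """Set requires_grad=False on parameters whose name starts with any
    of the given prefixes; return the names that are still trainable."""
    trainable = []
    for name, param in named_params:
        if name.startswith(frozen_prefixes):
            param.requires_grad = False
        elif param.requires_grad:
            trainable.append(name)
    return trainable


# Stand-in parameters so the sketch runs without PyTorch installed.
# The names below are hypothetical; with a real model you would pass
# model.named_parameters() instead.
fake_params = [
    ("backbone.text_model.embeddings.weight", SimpleNamespace(requires_grad=True)),
    ("backbone.image_model.stem.weight", SimpleNamespace(requires_grad=True)),
    ("neck.top_down.weight", SimpleNamespace(requires_grad=True)),
    ("bbox_head.cls.weight", SimpleNamespace(requires_grad=True)),
]

# Freeze only the (assumed) text-encoder prefix.
trainable_names = freeze_by_prefix(fake_params, ("backbone.text_model.",))
print(trainable_names)
```

With a real model, the optimizer would then be built only from parameters that still have `requires_grad=True`. To make CLIP trainable for a large vocabulary, simply skip the freezing call.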
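
For point 3, the idea of precomputing text embeddings is to run the text encoder once per class name, normalize the result, and cache it so that no text encoder is needed at detection time. The following is only a rough sketch of that workflow. `toy_encode` is a placeholder and not CLIP, the JSON cache format and the helper names are assumptions made for illustration, and docs/prompt_yolo_world describes the tool and file format the repo actually expects.

```python
import json
import math
import os
import tempfile
from typing import Callable, Dict, List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def precompute_text_embeddings(class_names: Sequence[str],
                               encode: Callable[[str], Sequence[float]],
                               path: str) -> Dict[str, List[float]]:
    """Encode every class name once, normalize, and cache the table on disk."""
    table = {name: l2_normalize(encode(name)) for name in class_names}
    with open(path, "w", encoding="utf-8") as f:
        json.dump(table, f)
    return table


def toy_encode(text: str) -> List[float]:
    """Placeholder encoder (NOT CLIP): a few deterministic character statistics.
    Swap in a real text encoder here."""
    return [float(len(text)),
            float(sum(map(ord, text)) % 97 + 1),
            float(text.count(" ") + 1)]


class_names = ["person", "car", "human smoking scenes"]
cache_path = os.path.join(tempfile.gettempdir(), "text_embeddings.json")
precompute_text_embeddings(class_names, toy_encode, cache_path)

# At detection time only the cached table is loaded; no text encoder runs.
with open(cache_path, encoding="utf-8") as f:
    cached = json.load(f)
print({name: len(vec) for name, vec in cached.items()})
```

Because the vocabulary is fixed once the embeddings are cached, adding or renaming a class means regenerating the cache.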

Meicsu199345 commented 1 week ago

Thank you very much for your answer~