PaddlePaddle / PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrained models and a diffusion model toolbox, with high performance and flexibility.
Apache License 2.0

Add YOLO-World forward and inference #530

Closed InsaneOnion closed 1 month ago

InsaneOnion commented 1 month ago

Add YOLO-World forward and inference. Some examples of forward alignment between the torch and paddle implementations:

  1. bus.jpg
     torch command:
       python image_demo.py configs/pretrain/yolo_world_v2_x_vlpan_bn_2e-3_100e_4x8gpus_obj365v1_goldg_train_lvis_minival.py /home/onion/workspace/code/PaddleMIX/ppdiffusers/examples/YOLO-World/pretrain/yolo_world_v2_x_obj365v1_goldg_cc3mlite_pretrain-8698fbfa.pth /home/onion/bus.jpg 'person,bus' --topk 100 --threshold 0.001 --output-dir /home/onion/out/
     paddle command:
       python infer.py --config configs/yolo_world_x.yml -o weights=./pretrain/yolo_world_v2_x_obj365v1_goldg_cc3mlite_pretrain-8698fbfa.pdparams --image /home/onion/bus.jpg --text 'person,bus' --topk 100 --threshold 0.01 --output_dir "/home/onion/out"
  2. zidane.jpg
     torch command:
       python image_demo.py configs/pretrain/yolo_world_v2_x_vlpan_bn_2e-3_100e_4x8gpus_obj365v1_goldg_train_lvis_minival.py /home/onion/workspace/code/PaddleMIX/ppdiffusers/examples/YOLO-World/pretrain/yolo_world_v2_x_obj365v1_goldg_cc3mlite_pretrain-8698fbfa.pth /home/onion/zidane.jpg 'bald man,white haired man' --topk 100 --threshold 0.001 --output-dir /home/onion/out/
     paddle command:
       python infer.py --config configs/yolo_world_x.yml -o weights=./pretrain/yolo_world_v2_x_obj365v1_goldg_cc3mlite_pretrain-8698fbfa.pdparams --image /home/onion/zidane.jpg --text 'bald man,white haired man' --topk 100 --threshold 0.001 --output_dir "/home/onion/out"
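
For reviewers who want to check the alignment numerically, here is a minimal sketch of one way to compare the two outputs. It is not part of this PR. It assumes the boxes or scores from each framework have already been dumped to arrays of the same shape and in the same order; the helper names `max_abs_diff` and `is_aligned` and the tolerance `atol=1e-4` are hypothetical choices for illustration.

```python
import numpy as np


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two same-shape arrays."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # An empty array has no elements to differ, so report 0.0.
    return float(np.max(np.abs(a - b))) if a.size else 0.0


def is_aligned(torch_out, paddle_out, atol=1e-4):
    """True when every element differs by at most `atol` (hypothetical tolerance)."""
    return max_abs_diff(torch_out, paddle_out) <= atol


if __name__ == "__main__":
    # Synthetic placeholder values, NOT real model outputs from either framework.
    torch_boxes = np.array([[10.0, 20.0, 110.0, 220.0]])
    paddle_boxes = torch_boxes + 1e-5
    print("max abs diff:", max_abs_diff(torch_boxes, paddle_boxes))
    print("aligned:", is_aligned(torch_boxes, paddle_boxes))
```

A check like this is only meaningful when both commands use the same score threshold and top-k, so that both sides produce the same set of detections.
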
paddle-bot[bot] commented 1 month ago

Thanks for your contribution!

LokeZhou commented 1 month ago

Please move all of the files to PaddleMIX/paddlemix/examples.