facebookresearch / VLPart

[ICCV2023] VLPart: Going Denser with Open-Vocabulary Part Segmentation
MIT License
335 stars 16 forks

object detection on aerial image #6

Open meidachen opened 9 months ago

meidachen commented 9 months ago

Thanks for the great work! I'm hoping to use VLPart to detect objects in aerial images. However, the results are not what I expected. Could you please help me? I'm not sure whether I'm using the repo incorrectly or whether this is a limitation of the released model on aerial images. Thank you in advance for your help!

Here is a result I got by running:

python demo/demo.py --config-file configs/joint_in/swinbase_cascade_lvis_paco_pascalpart_partimagenet_inparsed.yaml --input 2.jpg --output output_image --vocabulary custom --custom_vocabulary "road,building,window,tree,car,light pole" --confidence-threshold 0.7 --opts MODEL.WEIGHTS models/swinbase_cascade_lvis_paco_pascalpart_partimagenet_inparsed.pth VIS.BOX False

I also tried lowering the confidence threshold, but nothing other than the cars was detected.

PeizeSun commented 9 months ago

Hi, this is because "road", "building", "window", "tree", and "light pole" are not in the training datasets (LVIS, PACO, PascalPart, PartImageNet). We are sorry about that. For the model to recognize these thing/stuff categories, related training data would need to be added.
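One way to catch this before running the demo is to compare the custom vocabulary against the category names of the training datasets. The snippet below is only a minimal sketch of that check: `split_vocabulary` is a hypothetical helper, not part of VLPart, and `known_categories` is a one-item placeholder based on this thread (only cars were detected), not the real LVIS/PACO/PascalPart/PartImageNet category list, which you would need to load from the dataset annotations yourself.

```python
def split_vocabulary(custom_vocabulary, known_categories):
    """Split a comma-separated vocabulary string into (covered, missing) lists.

    Hypothetical helper for illustration; not part of the VLPart repo.
    Matching is by exact, case-insensitive name only.
    """
    known = {name.strip().lower() for name in known_categories}
    terms = [t.strip() for t in custom_vocabulary.split(",") if t.strip()]
    covered = [t for t in terms if t.lower() in known]
    missing = [t for t in terms if t.lower() not in known]
    return covered, missing


# Placeholder only: NOT the real training category list.
known_categories = ["car"]

covered, missing = split_vocabulary(
    "road,building,window,tree,car,light pole", known_categories
)
print("covered by training data:", covered)
print("not in training data:", missing)
```

Exact name matching is a simplification: dataset category names often carry qualifiers or synonyms, so a term reported as missing here may still need a manual look at the actual category list.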