Open 12cyan opened 10 months ago
@12cyan Strong REC (Referring Expression Comprehension) capability requires training on a portion of REC data. We have not used RefCOCO so far, but we plan to release a model trained on the complete RefCOCO dataset in the future.
After running the command below, why is the detection result so poor? Is something misconfigured?
command: python demo/image_demo.py test_images/two_human/ configs/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365.py --weights grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth --texts 'The person standing on the left.' --tokens-positive -1 --out-dir /home/wh/sjx/mm/mmdetection/outputs/ --device 'cpu'
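For anyone who wants to rerun this with different referring expressions, the same invocation can be assembled in a script. This is only a minimal sketch, assuming Python 3.8+ (for `shlex.join`). `build_demo_command` is a hypothetical helper, not part of MMDetection, and it uses only the flags that appear in the command above.

```python
import shlex


def build_demo_command(image_dir, config, weights, text, out_dir, device="cpu"):
    """Hypothetical helper: assemble the image_demo.py argument list from the post."""
    return [
        "python", "demo/image_demo.py", image_dir, config,
        "--weights", weights,
        "--texts", text,              # the referring expression, passed as one argument
        "--tokens-positive", "-1",
        "--out-dir", out_dir,
        "--device", device,
    ]


argv = build_demo_command(
    image_dir="test_images/two_human/",
    config="configs/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365.py",
    weights="grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth",
    text="The person standing on the left.",
    out_dir="/home/wh/sjx/mm/mmdetection/outputs/",
)

# Print a shell-safe version of the command; the text prompt is quoted automatically.
print(shlex.join(argv))
```

To actually launch it, the list can be passed to `subprocess.run(argv, check=True)` from the mmdetection checkout. Passing a list avoids shell quoting problems with the spaces in the text prompt.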
result: