Ozziko opened 2 years ago
@Ozziko Thanks for your interest in our work, and for your nice suggestion!
I've tried the installation (it's essentially the same as Detectron2's installation) on different machines and it worked well. Do you mind letting me know what problems showed up?
For concept embeddings, take zero-shot inference as an example: you just need to replace the value of MODEL.CLIP.TEXT_EMB_PATH
with the path to your embedding file. Our model can then recognize regions as your custom concepts by matching region features to your concept embeddings.
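To make the matching step concrete, here is a minimal sketch (not RegionCLIP's actual code) of the idea: each region feature is assigned the concept whose text embedding has the highest cosine similarity. The function name and the toy 4-d vectors are purely illustrative.

```python
import numpy as np

def classify_regions(region_feats, concept_embs):
    """region_feats: (R, D) array; concept_embs: (C, D) array.
    Returns the index of the best-matching concept for each region."""
    # L2-normalize both sides so the dot product is cosine similarity.
    r = region_feats / np.linalg.norm(region_feats, axis=1, keepdims=True)
    c = concept_embs / np.linalg.norm(concept_embs, axis=1, keepdims=True)
    sims = r @ c.T                 # (R, C) cosine similarities
    return sims.argmax(axis=1)     # best concept per region

# Toy example: 2 regions, 3 concepts, 4-d embedding space.
regions = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0]])
concepts = np.array([[0.9, 0.1, 0.0, 0.0],   # closest to region 0
                     [0.0, 0.0, 1.0, 0.0],
                     [0.1, 0.9, 0.0, 0.0]])  # closest to region 1
print(classify_regions(regions, concepts))   # -> [0 2]
```

Swapping in a different embedding file only changes the `concept_embs` matrix, which is why no retraining is needed.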
I've already provided an annotated script for model inference which you might be interested in :-)
@YiwuZhong Thanks for the fast reply! I don't remember the error I got during installation... Your annotated script seems useful for later.
Can you supply the exact command to use for ZS detection inference with your pretrained model on custom images & custom labels?
The one you supplied in the readme (below) didn't work because of the mismatch between the number of classes expected by the YAML config and the actual number in my custom labels, and there may be more things to take into account - that's why I'm asking...
```
!python3 ./tools/train_net.py \
  --eval-only \
  --num-gpus 1 \
  --config-file ./configs/LVISv1-InstanceSegmentation/CLIP_fast_rcnn_R_50_C4_custom_img.yaml \
  MODEL.WEIGHTS ./pretrained_ckpt/regionclip/regionclip_pretrained-cc_rn50x4.pth \
  MODEL.CLIP.TEXT_EMB_PATH ./pretrained_ckpt/concept_emb/coco_80_cls_emb_rn50x4.pth \
  MODEL.CLIP.OFFLINE_RPN_CONFIG ./configs/LVISv1-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml \
  MODEL.CLIP.TEXT_EMB_DIM 640 \
  MODEL.RESNETS.DEPTH 200 \
  MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION 18
```
Hi, I have the same problem, that is, how to run detection inference with the pretrained model on custom images & custom labels. Did you solve it? If so, could you elaborate on how to do it? Thanks a lot.
This issue should be addressed by simply setting MODEL.ROI_HEADS.NUM_CLASSES to the number of your custom categories.
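Putting the two overrides together, the readme command for custom concepts might look like the sketch below. The embedding path `./my_concepts_emb.pth` and the category count `3` are placeholders - substitute the file you generated and the number of concepts it actually contains.

```shell
python3 ./tools/train_net.py \
  --eval-only \
  --num-gpus 1 \
  --config-file ./configs/LVISv1-InstanceSegmentation/CLIP_fast_rcnn_R_50_C4_custom_img.yaml \
  MODEL.WEIGHTS ./pretrained_ckpt/regionclip/regionclip_pretrained-cc_rn50x4.pth \
  MODEL.CLIP.TEXT_EMB_PATH ./my_concepts_emb.pth \
  MODEL.CLIP.OFFLINE_RPN_CONFIG ./configs/LVISv1-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml \
  MODEL.CLIP.TEXT_EMB_DIM 640 \
  MODEL.RESNETS.DEPTH 200 \
  MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION 18 \
  MODEL.ROI_HEADS.NUM_CLASSES 3
```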
Hi guys, great work (!), but I'm not sure how to try your pretrained models (and maybe cite you) for other tasks that require zero-shot detection on custom objects (and custom images).
First, I followed your installation instructions and they didn't work (you might want to check...). I eventually succeeded in installing it in Colab (avoiding the CUDA/torch incompatibilities), with the right Detectron2 version. Then I created concept embeddings for the objects I needed, but I didn't understand how to use them with your pretrained model. Can you explain/write the command?
Detic published a great, simple Colab notebook for trying out their model (I'm not related to them, just impressed). I'm sure that if you write a similar notebook, it will make your code/models much more attractive for others to use :-) Thanks!