YifanXu74 / MQ-Det

Official PyTorch implementation of "Multi-modal Queried Object Detection in the Wild" (accepted by NeurIPS 2023)
Apache License 2.0

Function Request: Demo of finetuning-free zero(few)-shot single image inference #8

Open TianpengBu opened 10 months ago

TianpengBu commented 10 months ago

Hi Author,

Great work! I'm wondering whether you have any plans to develop a demo showing the few-shot ability of the model.

Specifically, we can prepare some vision queries (a few examples of novel classes). With these vision queries, we then perform object detection on an input image from the same domain as the vision queries. I'm excited to see the performance of MQ-Det in this scenario, since I believe that with VisionAndText queries, MQ-Det can definitely perform better! :)

The function I proposed is very similar to the playground demo of OWL-ViT (link below).

https://colab.research.google.com/github/google-research/scenic/blob/main/scenic/projects/owl_vit/notebooks/OWL_ViT_inference_playground.ipynb
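
To make the requested workflow concrete, here is a minimal sketch of the general idea: collect a few example features per novel class, then label candidate regions of a new image by comparing against them. This is not MQ-Det's actual mechanism or API. It swaps in a simple nearest-prototype match with cosine similarity, and every name here (`build_query_bank`, `classify_regions`, the toy feature vectors, the `threshold` value) is a hypothetical placeholder; a real demo would take features from the model's image encoder.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_query_bank(examples):
    """Average the example features of each class into one prototype.

    `examples` maps a class name to a non-empty list of feature vectors.
    (Hypothetical stand-in for extracting vision queries from a few images.)
    """
    bank = {}
    for name, vectors in examples.items():
        dim = len(vectors[0])
        bank[name] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return bank


def classify_regions(region_features, bank, threshold=0.5):
    """Assign each region the best-matching class, or None if below threshold."""
    results = []
    for feature in region_features:
        scored = [(name, cosine(feature, proto)) for name, proto in bank.items()]
        best_name, best_score = max(scored, key=lambda pair: pair[1])
        label = best_name if best_score >= threshold else None
        results.append((label, best_score))
    return results


# Toy 2-D features; assumed values, for illustration only.
examples = {"cat": [[1.0, 0.0], [0.8, 0.2]], "dog": [[0.0, 1.0]]}
bank = build_query_bank(examples)
detections = classify_regions([[1.0, 0.1], [0.1, 0.9]], bank)
labels = [label for label, _ in detections]
```

In this toy run the first region is closest to the "cat" prototype and the second to "dog". A real few-shot demo would additionally need region proposals and box outputs, which this sketch leaves out.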

YifanXu74 commented 10 months ago

Hi @TianpengBu, thank you for your constructive suggestion! I am in fact currently considering a demo. However, as an individual developer, I don't have enough time to work on it in the immediate future. I will carefully consider your suggestion and develop a demo in my spare time.