Open power0341 opened 7 months ago
Sorry for the late reply. We plan to release the v2 supplement after the paper is accepted. Your question about the model's general capabilities will be answered in our new version, due in the next two months. Thanks for your interest.
In the v2 paper, you mention several times that details of the dataset and model specifics are given in the supplement, but the arXiv version of the paper appears to have no such section. Could you point out where I can find the appendix? By the way, given such a high volume of object detection data, it seems the intention is to tailor the model to the REC/REG tasks, sort of like an open-vocab detector. Can it still do the LVLM things? Can it answer the question "What is unusual about this image?" If so, what's your insight on using a quite capable LLM like Llama 2 13B?
Great work again, and I look forward to the future release.