AILab-CVC / YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection
https://www.yoloworld.cc
GNU General Public License v3.0

Text input #315

Closed Czeni1 closed 4 months ago

Czeni1 commented 4 months ago

I would like to ask whether the text input has to be entered manually or whether it is already included in the code. I want to use this multimodal model for closed-set object detection, training and testing it the way I would with YOLO, and I don't understand where this text comes from. Looking forward to your reply!

LaplaceSama commented 4 months ago

Try this command:

python image_demo.py path/to/config path/to/weights image/path/directory 'person,dog,cat' --topk 100 --threshold 0.005 --output-dir demo_outputs
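For closed-set use, the class text does not have to be typed by hand; it could be derived from the dataset's category names. The following is a minimal sketch, assuming a COCO-style annotation file with a top-level `categories` list. The helper name `class_prompt_from_coco` is hypothetical and is not part of YOLO-World.

```python
import json


def class_prompt_from_coco(annotation_json: str) -> str:
    """Build a comma-separated class prompt from COCO-style annotation JSON.

    Hypothetical helper, not part of YOLO-World. Assumes a top-level
    "categories" list whose entries have "id" and "name" keys.
    """
    data = json.loads(annotation_json)
    # Sort by category id so the prompt order is stable across runs.
    categories = sorted(data["categories"], key=lambda c: c["id"])
    return ",".join(c["name"].strip() for c in categories)


# Tiny in-memory example standing in for a real annotation file.
example = json.dumps({
    "categories": [
        {"id": 2, "name": "dog"},
        {"id": 1, "name": "person"},
        {"id": 3, "name": "cat"},
    ]
})
print(class_prompt_from_coco(example))  # person,dog,cat
```

The resulting string has the same shape as the 'person,dog,cat' argument in the command above.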

dq1125 commented 4 months ago

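Hello, I have a related question: can the model take a full sentence as input? That is, given a sentence and an image, can it draw boxes around the objects mentioned in the sentence, or must the input be individual, isolated words? If a sentence is allowed, how should it be passed to the model?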

wondervictor commented 4 months ago

Hi @dq1125, currently the demo only supports separate noun phrases, words, or object captions. If you want to input a complete caption containing several objects, you can use the script below to extract the separate noun phrases before feeding them into YOLO-World:

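import string

import nltk
from nltk import word_tokenize, pos_tag

# Tokenizer and POS-tagger data; recent NLTK releases may instead ask for
# 'punkt_tab' and 'averaged_perceptron_tagger_eng'.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')


def extract_noun_phrases(text):
    """Return the noun phrases found in an English sentence."""
    tokens = word_tokenize(text)
    # Drop punctuation tokens so they do not end up inside phrases.
    tokens = [token for token in tokens if token not in string.punctuation]
    tagged = pos_tag(tokens)
    print(tagged)  # debug: show the part-of-speech tags
    # NP chunk: optional determiner, any adjectives, then one or more nouns.
    grammar = 'NP: {<DT>?<JJ.*>*<NN.*>+}'
    cp = nltk.RegexpParser(grammar)
    result = cp.parse(tagged)

    noun_phrases = []
    for subtree in result.subtrees():
        if subtree.label() == 'NP':
            noun_phrases.append(' '.join(t[0] for t in subtree.leaves()))

    return noun_phrases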
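
Since the demo command takes its classes as one comma-separated string, the extracted phrases still need to be joined before use. A minimal sketch of that step, assuming the input is a list of strings such as the one returned by `extract_noun_phrases`; the helper name `phrases_to_prompt` is hypothetical, and the lowercasing and de-duplication are illustrative choices rather than anything YOLO-World requires.

```python
def phrases_to_prompt(noun_phrases):
    """Join noun phrases into one comma-separated prompt string.

    Hypothetical helper, not part of YOLO-World.
    """
    seen = set()
    unique = []
    for phrase in noun_phrases:
        # Lowercase and collapse repeated whitespace.
        cleaned = " ".join(phrase.lower().split())
        # Skip empty strings and phrases already seen, keeping first-seen order.
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            unique.append(cleaned)
    return ",".join(unique)


phrases = ["a red screwdriver", "A red  screwdriver", "the table"]
print(phrases_to_prompt(phrases))  # a red screwdriver,the table
```
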
dq1125 commented 4 months ago

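Thank you. I used the script you posted to extract noun phrases from my input sentence. My sentence was "Please help me find a red screwdriver", and the extracted phrase was "a red screwdriver". My own dataset is in COCO format, and its categories include "red screwdriver". Can I feed the extracted phrase "a red screwdriver" directly into YOLO-World? If my annotated category label is "red screwdriver", can the model use "a red screwdriver" to find that category and draw a box around it? If the dataset also contains other screwdriver labels such as "yellow screwdriver", can the model tell "red screwdriver" and "yellow screwdriver" apart by color? Or will the model reduce the phrase "a red screwdriver" to the single word "screwdriver" and localize objects from that noun alone, in which case would my labels be useless?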

wondervictor commented 4 months ago

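@dq1125 In the case you describe, both "a red screwdriver" and "red screwdriver" can be extracted. YOLO-World can currently distinguish colors: after pre-training, it can recognize different objects by their color, so "yellow screwdriver" and "red screwdriver" will each detect objects of the corresponding color.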
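
If you also want to check an extracted phrase against your own dataset labels, a leading determiner can be dropped before comparing. This is only a sketch under that assumption: the names `strip_determiner` and `match_labels` are hypothetical, the determiner list is deliberately tiny, and none of this is required by YOLO-World.

```python
DETERMINERS = {"a", "an", "the"}  # illustrative, not exhaustive


def strip_determiner(phrase):
    """Lowercase a phrase and drop one leading determiner, if present."""
    words = phrase.lower().split()
    if words and words[0] in DETERMINERS:
        words = words[1:]
    return " ".join(words)


def match_labels(noun_phrases, dataset_labels):
    """Return the dataset labels that the extracted phrases refer to."""
    # Map lowercased label -> original label for case-insensitive lookup.
    labels = {label.lower(): label for label in dataset_labels}
    matches = []
    for phrase in noun_phrases:
        key = strip_determiner(phrase)
        if key in labels and labels[key] not in matches:
            matches.append(labels[key])
    return matches


labels = ["red screwdriver", "yellow screwdriver"]
print(match_labels(["a red screwdriver"], labels))  # ['red screwdriver']
```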

dq1125 commented 4 months ago

OK, thank you very much for your answer!

wondervictor commented 4 months ago

This issue will be closed since there is no further update related to the main topic. Besides, the error has been fixed already. Thanks for your interest. If you have any questions about YOLO-World in the future, you're welcome to open a new issue.