Nice work! I ran some tests and found code in "models/detector.py" that can cause a problem:
```python
global_entity_list = []  # save all the entity type names for each sentence
for entity_str in extracted_entities:
    # border case: nothing to extract
    if 'none' in entity_str.lower():
        continue
    entity_list = entity_str.split('.')
    for ent in entity_list:
        global_entity_dict.setdefault(ent, {}).setdefault('total_count', 0)
        global_entity_dict.setdefault(ent, {}).setdefault('crop_path', [])
        global_entity_dict.setdefault(ent, {}).setdefault('bbox', [])
    global_entity_list.append(entity_list)
```
When the extracted entity is 'none', the sentence is skipped and `global_entity_list` gets no entry for it, so the indices of `global_entity_list` and `sample['split_sents']` fall out of alignment when they are zipped together in "models/questioner.py":
```python
def generate_questions(self, sample: Dict):
    sentences = sample['split_sents']
    global_entity_dict = sample['entity_info']
    global_entity_list = sample['entity_list']
    qs_list = []
    num_calls = len(sentences)
    print(f'generate ques will call llm {num_calls} times')
    for ent_list, sent in zip(global_entity_list, sentences):
        exist_entity = [ent for ent in ent_list
                        if ent in global_entity_dict and global_entity_dict[ent]['total_count'] > 0]
        # border case: no detection result for any entity. no question asked.
        if len(exist_entity) == 0:
            qs_list.append([])
            continue
        questions = get_res(self.nlp, '.'.join(exist_entity), sent)
        qs_list.append(questions)
```
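A minimal sketch of one possible fix (an assumption on my part, not an official patch): instead of skipping 'none' sentences entirely, append an empty list as a placeholder so `global_entity_list` keeps one entry per sentence and stays index-aligned with `split_sents`. The helper name `build_entity_lists` is hypothetical; the repo builds these structures inline.

```python
# Hypothetical helper illustrating the fix: keep one entry per sentence
# in global_entity_list, even when the extractor returned 'none'.
def build_entity_lists(extracted_entities):
    global_entity_dict = {}
    global_entity_list = []
    for entity_str in extracted_entities:
        # border case: nothing to extract -> keep an empty placeholder
        # so indices still line up with split_sents
        if 'none' in entity_str.lower():
            global_entity_list.append([])
            continue
        entity_list = entity_str.split('.')
        for ent in entity_list:
            global_entity_dict.setdefault(ent, {}).setdefault('total_count', 0)
            global_entity_dict.setdefault(ent, {}).setdefault('crop_path', [])
            global_entity_dict.setdefault(ent, {}).setdefault('bbox', [])
        global_entity_list.append(entity_list)
    return global_entity_dict, global_entity_list
```

With this change, `zip(global_entity_list, sentences)` in `generate_questions` pairs each sentence with its own entity list, and 'none' sentences naturally hit the `len(exist_entity) == 0` branch and get no questions.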
By the way, how does the performance of the VQA model affect Woodpecker's overall performance? I replaced GPT-3.5 with Llama 3, and I understand the LLM plays an important role. But for the VQA model, did you try other models? @xjtupanda @BradyFU
Sorry for the late reply; we've been very busy lately.
For the bug, could you please make a PR so we can fix that?
As reported in the paper, the VQA model mainly impacts attribute recognition (e.g., color); it does not perform well on object recognition, which is why we introduced a detection model. We haven't tried other VQA models, since BLIP-2 was already SOTA at that time.