ItemZheng / KDDAug

[ECCV2022] Rethinking Data Augmentation for Robust Visual Question Answering

Object detection example with Faster R-CNN is not consistent. #3

Closed: pjy0422 closed this issue 1 year ago

pjy0422 commented 1 year ago

Hi, your paper is really great, and I want to reproduce your experimental results.

In process_original_dataset.py, the default value of conf_threshold is set to 0.8.

With the value 0.8, I got this output for "Example with faster rcnn detection results": [{'q_id': 9001, 'img_id': 9, 'question': 'What color are the dishes?', 'answer_text': ['pink and yellow'], 'scores': [0.9], 'objects': [], 'attributes': []}]

With the value 0.7, I got {'q_id': 9001, 'img_id': 9, 'question': 'What color are the dishes?', 'answer_text': ['pink and yellow'], 'scores': [0.9]}

With the value 0.51, I got {'q_id': 9001, 'img_id': 9, 'question': 'What color are the dishes?', 'answer_text': ['pink and yellow'], 'scores': [0.9], 'objects': ['broccoli', 'donut', 'container'], 'attributes': ['green', '', 'plastic']}.

With the value 0.4, I finally got the same result as in the example.
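
To make the effect of the thresholds concrete, here is a minimal sketch of how a confidence threshold and an attribute threshold could filter detections. It is not the code in process_original_dataset.py: the function filter_detections, the field names obj_conf and attr_conf, and all scores are hypothetical and chosen only so that the output resembles the records above.

```python
from typing import Dict, List, Tuple


def filter_detections(
    detections: List[Dict],
    conf_thresh: float,
    attr_thresh: float,
) -> Tuple[List[str], List[str]]:
    """Keep objects whose confidence reaches conf_thresh.

    For each kept object, keep its attribute only if the attribute score
    reaches attr_thresh; otherwise store an empty string as a placeholder.
    This is an assumed behavior, not the repository's actual logic.
    """
    objects: List[str] = []
    attributes: List[str] = []
    for det in detections:
        if det["obj_conf"] < conf_thresh:
            continue  # object is not confident enough, drop it entirely
        objects.append(det["obj"])
        attributes.append(det["attr"] if det["attr_conf"] >= attr_thresh else "")
    return objects, attributes


# Hypothetical detector output for one image (scores are invented).
detections = [
    {"obj": "broccoli", "obj_conf": 0.62, "attr": "green", "attr_conf": 0.55},
    {"obj": "donut", "obj_conf": 0.58, "attr": "brown", "attr_conf": 0.20},
    {"obj": "container", "obj_conf": 0.53, "attr": "plastic", "attr_conf": 0.47},
]

strict = filter_detections(detections, conf_thresh=0.8, attr_thresh=0.4)
loose = filter_detections(detections, conf_thresh=0.51, attr_thresh=0.4)
print(strict)
print(loose)
```

Under these assumed scores, the strict setting keeps no objects, while the looser one keeps all three and leaves one attribute empty. A lower threshold keeps more objects per image, which also means more data to hold in memory in later steps.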

With the default value 0.8, training ran normally, but the results were too low. After that, when I proceeded with 0.4, a memory error occurred and training could not continue.

In conclusion, please tell me the values of attr_thresh and conf_thresh in process_original_dataset.py that you used to conduct the experiments.

ItemZheng commented 1 year ago

We used attr_thresh = 0.4 and conf_thresh = 0.8. The sample shown in the example is wrong. The correct output is:

{
    'q_id': 9001,
    'img_id': 9,
    'question': 'What color are the dishes?',
    'answer_text': ['pink and yellow'],
    'scores': [0.9],
    'objects': [],
    'attributes': [],
    'nouns': ['dish'],
    'ori_nouns': ['dishes']
}

pjy0422 commented 1 year ago

Thank you for your kind reply. All the problems were solved, and I was able to reproduce the results.

I read your paper and found that UpDn, LMH, and LMH+CSS were used as backbones, but have you used any other models as a backbone? For example, RUBi, CF-VQA, etc.

I want to apply your work to CF-VQA, but I don't know how. Where should I start?

Thank you.

ItemZheng commented 1 year ago

Prepare a pretrained CF-VQA model, then fine-tune it on our augmented dataset. If CF-VQA uses UpDn as its backbone, you can use this command directly:

CUDA_VISIBLE_DEVICES=0 python aug_main.py --backbone ./path/to/model --aug_name all --dataset cpv2 --output [] --seed 0

If CF-VQA uses another model as its backbone, you have to modify the code (mainly the code for the model architecture).

pjy0422 commented 1 year ago

Once again, thank you very much for your quick and kind reply. I think all the problems have been solved.