MILVLG / openvqa

A lightweight, scalable, and general framework for visual question answering research
Apache License 2.0

Custom training data where the answer is a sentence #62

Closed ihabZhaika closed 4 years ago

ihabZhaika commented 4 years ago

Hey @MIL-VLG, I'd like your advice if possible. I was able to generate the .npz files and run the project, but I am facing a weird issue. I have some medical data consisting of questions and answers, each of which is a full sentence, whereas in the original VQA data most answers are a single word, a number, or something similar. I am running mcan_small, and the accuracy is always 0.0. After inspecting the output, I see that the predicted answers are irrelevant and 99% of them are identical. During training the loss is very small (0.0006), which makes me wonder how the loss can be that small while the answers are so irrelevant.

Raw data sample: synpic45783|what is abnormal in the x-ray?|hydroxyapatite crystal deposition disease

Preprocessed data sample to fit the format [same sample]:

In the questions: { "question_id": 183, "question": "what plane is demonstrated?", "image_id": 45783 },

In the annotations: { "question_id": 130, "answers": [ { "answer_id": 1, "answer": "hydroxyapatite crystal deposition disease", "answer_confidence": "yes" } ], "answer_type": "other", "image_id": 45783, "question_type": "what is" },
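For reference, the conversion described above could be sketched roughly as follows. This is only an illustration, not code from the thread or from openvqa: the helper name `convert_line`, the rule that `image_id` is the image name with the `synpic` prefix stripped, and the rule that `question_type` is the first two words of the question are all assumptions inferred from the sample. The sketch writes the same `question_id` into both entries, since VQA-style loaders typically pair a question with its annotation through that field (in the sample as posted the two values differ, 183 and 130).

```python
import json

def convert_line(line, question_id):
    """Turn one 'image|question|answer' record into VQA-style entries.

    Hypothetical helper: the field layout follows the sample in this post.
    """
    # Split on the first two pipes only, in case the answer contains '|'.
    image_name, question, answer = [p.strip() for p in line.split("|", 2)]
    # Assumption: the numeric image_id is the image name without "synpic".
    image_id = int(image_name.replace("synpic", ""))
    question_entry = {
        "question_id": question_id,
        "question": question,
        "image_id": image_id,
    }
    annotation_entry = {
        "question_id": question_id,  # same id as the question entry
        "answers": [
            {"answer_id": 1, "answer": answer, "answer_confidence": "yes"}
        ],
        "answer_type": "other",
        "image_id": image_id,
        # Assumption: question type is the first two words of the question.
        "question_type": " ".join(question.split()[:2]),
    }
    return question_entry, annotation_entry

raw = ("synpic45783|what is abnormal in the x-ray?|"
       "hydroxyapatite crystal deposition disease")
q, a = convert_line(raw, question_id=183)
print(json.dumps(q, indent=2))
print(json.dumps(a, indent=2))
```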

Example of a predicted answer [the same prediction for the entire validation set]: {'image_id': 28204, 'answer_type': 'other', 'question_id': 0, 'question_type': 'was', 'answer': 'model'}

What do you think I can do to overcome this issue?

MIL-VLG commented 4 years ago

Currently, we do not support such generation-based VQA models, since the three benchmark datasets we provide have short answers and VQA on them is usually treated as a classification problem.

Since supporting this would require a major modification of the framework, we have no plans to implement it for now. If you want it, you may need to implement it yourself. Otherwise, can you preprocess the answers in your dataset into short ones and use the classification framework we provide?

ihabZhaika commented 4 years ago

Hey, by "short" do you mean a one-word answer?

MIL-VLG commented 4 years ago

Not exactly. The answers can be longer than one word, e.g., "blue and white" or "dark blue". But we still treat it as a classification problem by using the high-frequency answers (e.g., those that occur 8 times in the training set) as the answer vocabulary.
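The idea of a frequency-based answer vocabulary can be sketched as below. This is a minimal illustration under stated assumptions, not openvqa's actual preprocessing: the function names `build_answer_vocab` and `coverage` are hypothetical, the threshold is read as "at least `min_count` occurrences", and the default of 8 is just the example figure from the reply above. The coverage check shows why sentence-length answers are a problem for this setup: each one tends to be unique, so it falls outside the vocabulary and can never be predicted.

```python
from collections import Counter

def build_answer_vocab(answers, min_count=8):
    """Map each sufficiently frequent answer string to a class index.

    Assumption: an answer is kept when it occurs at least `min_count` times.
    """
    counts = Counter(a.strip().lower() for a in answers)
    kept = sorted(ans for ans, n in counts.items() if n >= min_count)
    return {ans: idx for idx, ans in enumerate(kept)}

def coverage(answers, ans_to_ix):
    """Fraction of answers that the vocabulary can represent."""
    hits = sum(1 for a in answers if a.strip().lower() in ans_to_ix)
    return hits / len(answers)

# Toy training answers: two frequent short answers and one rare long one.
train_answers = (
    ["dark blue"] * 9
    + ["blue and white"] * 8
    + ["hydroxyapatite crystal deposition disease"]
)

ans_to_ix = build_answer_vocab(train_answers)
print(ans_to_ix)
print(f"coverage: {coverage(train_answers, ans_to_ix):.2f}")
```

If the coverage on a custom dataset is close to zero, most training samples have no valid target class, which would be worth checking before training a classification model on it.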