dino-chiio / blip-vqa-finetune

This is an implementation of fine-tuning the BLIP model for Visual Question Answering.

More info on the dataset used and finetuning results #1

Closed SharmaSubham975 closed 9 months ago

SharmaSubham975 commented 9 months ago

Can you please share more info on the dataset used for training and some of the results?

dino-chiio commented 9 months ago

I have uploaded the data here.

The test set does not have a label because I evaluated it on the Kaggle competition of my course. The accuracy I got was 96%.

uttam-scai commented 7 months ago

Thank you for your dataset. I downloaded your data and I have one question: have you created one folder for each image, or do the gaps in the folder numbering (folders 1, 3, 33) mean I can keep many images of the same type in the same folder? I have added a screenshot for clarity. Thank you!


dino-chiio commented 7 months ago

> Thank you for your dataset. I downloaded your data and I have one question: have you created one folder for each image, or do the gaps in the folder numbering (folders 1, 3, 33) mean I can keep many images of the same type in the same folder? I have added a screenshot for clarity. Thank you!

Each numbered folder contains only one image, i.e., it is one data sample. The training data information is stored in the .json file, which is loaded in the code.
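To illustrate, here is a minimal sketch of how such a layout (one image per numbered folder, plus a top-level .json annotation file) could be loaded. The file name `annotations.json` and the `question`/`answer` keys are assumptions for illustration, not the repo's actual names:

```python
import json
import os
import tempfile

def load_samples(root, ann_file="annotations.json"):
    """Load VQA samples from a layout where each numbered folder
    holds exactly one image and a top-level .json maps folder IDs
    to annotations. Names/keys here are hypothetical."""
    with open(os.path.join(root, ann_file)) as f:
        annotations = json.load(f)
    samples = []
    for folder_id, ann in annotations.items():
        folder = os.path.join(root, folder_id)
        # pick the single image file inside the numbered folder
        image_name = next(
            n for n in os.listdir(folder)
            if n.lower().endswith((".jpg", ".jpeg", ".png"))
        )
        samples.append({
            "image": os.path.join(folder, image_name),
            "question": ann["question"],
            "answer": ann["answer"],
        })
    return samples

# Build a tiny mock dataset (folders "1" and "3", mirroring the numbering gaps)
root = tempfile.mkdtemp()
for fid in ("1", "3"):
    os.makedirs(os.path.join(root, fid))
    open(os.path.join(root, fid, "img.png"), "wb").close()
with open(os.path.join(root, "annotations.json"), "w") as f:
    json.dump({
        "1": {"question": "What is shown?", "answer": "a cat"},
        "3": {"question": "How many?", "answer": "two"},
    }, f)

samples = load_samples(root)
print(len(samples))  # 2
```

Gaps in the folder numbers are harmless under this scheme: iteration is driven by the .json keys, not by a contiguous range of folder IDs.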