xiaoman-zhang / PMC-VQA

PMC-VQA is a large-scale medical visual question-answering dataset, which contains 227k VQA pairs of 149k images that cover various modalities or diseases.
MIT License
172 stars 11 forks

How to run as an API for specific image and task? #14

Open nhandang-ai4ia opened 1 year ago

nhandang-ai4ia commented 1 year ago

Hi, thanks for the great work. When I ran test.py on MedVInT_TD, the dataloader raised a KeyError for Caption. I checked the downloaded data and found no column named Caption. My goal is to wrap the model as an API that receives an image and a prompt as input and returns the result. However, I am still unsure what input_ids is. Could you elaborate on this, and on how we could serve the model, for example with FastAPI?
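For readers with the same question: in transformer-style models, input_ids usually refers to the sequence of integer token ids that a tokenizer produces from the text prompt. The snippet below is only a toy sketch of that idea. The vocabulary `TOY_VOCAB` and the helper `encode` are hypothetical and are not taken from this repository, which presumably relies on a pretrained tokenizer instead.

```python
from typing import List

# Hypothetical toy vocabulary, for illustration only.
# A real setup would load a pretrained tokenizer instead.
TOY_VOCAB = {
    "<pad>": 0, "<unk>": 1, "question:": 2,
    "what": 3, "is": 4, "shown": 5, "answer:": 6,
}


def encode(prompt: str, max_length: int = 8) -> List[int]:
    """Map a whitespace-split prompt to a fixed-length list of token ids."""
    # Unknown words fall back to the <unk> id.
    ids = [TOY_VOCAB.get(tok, TOY_VOCAB["<unk>"]) for tok in prompt.lower().split()]
    # Truncate, then right-pad with the <pad> id up to max_length.
    ids = ids[:max_length]
    return ids + [TOY_VOCAB["<pad>"]] * (max_length - len(ids))


input_ids = encode("Question: what is shown Answer:")
print(input_ids)  # [2, 3, 4, 5, 6, 0, 0, 0]
```

An API endpoint would perform this encoding step on the incoming prompt, preprocess the uploaded image, and pass both to the model.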

nhandang-ai4ia commented 1 year ago

Hi, I managed to run it, silly me. Version 2 of the dataset has a column named Caption, and from there I understood input_ids. The remaining problem is the pretrained model; the error and the current fix are described in #5. The text generated by the model is awkward. Here is one example:

al CT 221 CT CT CT\u2009 CT\u2009: oftery is occased? comlaced by to the CT scanmonary angiogram? ose: ort Right pul pulary artery : umflex artery C:Right coronary artery D:Left anterior descending arary artery arrow: C

I suspect that the key renaming and deletion in the model checkpoint cause this behaviour. Could the authors check that the pretrained model is up to date?

In PMC_QA_Dataset.py, line 95, there is also a small typo: the key should be sample['Answer'], not sample['Anwser'].

Thank you
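One way to test the suspicion about renamed or deleted checkpoint keys is to compare the key names in the checkpoint with the names the model expects. The sketch below uses plain Python; the helper `compare_keys` and all key names are hypothetical and only illustrate the check. In PyTorch, calling `load_state_dict` with `strict=False` returns a comparable report with `missing_keys` and `unexpected_keys`.

```python
from typing import Dict, Iterable, List


def compare_keys(checkpoint_keys: Iterable[str],
                 model_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Report parameter names present on only one side."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        # Expected by the model but absent from the checkpoint:
        # these parameters would keep their random initialisation.
        "missing_keys": sorted(model - ckpt),
        # Present in the checkpoint but unknown to the model:
        # these weights would be silently ignored.
        "unexpected_keys": sorted(ckpt - model),
    }


# Hypothetical key names, for illustration only.
checkpoint = {"llama.embed.weight": 0, "llama.lm_head.weight": 0, "vision.proj.bias": 0}
expected = ["llama.embed.weight", "llama_model.lm_head.weight", "vision.proj.bias"]

report = compare_keys(checkpoint, expected)
print(report)
```

A non-empty `missing_keys` list for the language-model layers would be consistent with garbled generations like the one above, since those layers would not have received pretrained weights.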