taokz / BiomedGPT

BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks

No such file or directory: '/data/omnimed/pretrain_data/negative_sample/type2ans.json' #19

Closed JennDong closed 4 months ago

JennDong commented 5 months ago

I encounter the following error when evaluating the pretrained model with evaluate_vqa_pretrained_beam_scale.sh, whereas it works fine for the finetuned models. I don't know what this type2ans.json file should contain or how it is generated.

Traceback (most recent call last):
  File "/.conda/envs/biomedgpt/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/.conda/envs/biomedgpt/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/BiomedGPT/evaluate.py", line 162, in <module>
    cli_main()
  File "/BiomedGPT/evaluate.py", line 157, in cli_main
    cfg, main, ema_eval=args.ema_eval, beam_search_vqa_eval=args.beam_search_vqa_eval, zero_shot=args.zero_shot
  File "/BiomedGPT/fairseq/distributed/utils.py", line 389, in call_main
    main(cfg, **kwargs)
  File "/data2/dongwenjie/BiomedGPT/evaluate.py", line 84, in main
    num_shards=cfg.checkpoint.checkpoint_shard_count,
  File "/BiomedGPT/utils/checkpoint_utils.py", line 447, in load_model_ensemble_and_task
    task = tasks.setup_task(cfg.task)
  File "/BiomedGPT/fairseq/tasks/__init__.py", line 46, in setup_task
    return task.setup_task(cfg, **kwargs)
  File "/BiomedGPT/tasks/ofa_task.py", line 111, in setup_task
    return cls(cfg, src_dict, tgt_dict)
  File "/BiomedGPT/tasks/pretrain_tasks/unify_task.py", line 97, in __init__
    self.type2ans_dict = json.load(open(os.path.join(self.cfg.neg_sample_dir, 'type2ans.json')))
FileNotFoundError: [Errno 2] No such file or directory: '/data/omnimed/pretrain_data/negative_sample/type2ans.json'

JennDong commented 5 months ago

I still get another error: "AssertionError: Error: The local datafile /data/omnimed/processed_data/image/omnimed_image.tsv not exists!" How can I prepare this omnimed dataset, and how can I change this path?

taokz commented 4 months ago

Could you please share the complete script and the error log with me? I'm puzzled by these errors, as they appear to be related to the pretraining phase rather than the evaluation/inference phase. Additionally, I'm unclear about the reference to "evaluate_vqa_pretrained_beam_scale.sh"; did you mean your own modified version of "evaluate_vqa_beam_scale.sh"?

taokz commented 4 months ago

Alternatively, you may want to evaluate the model in the zero-shot setting. If so, you can directly add "--zero-shot \" to the evaluation script, for example:

CUDA_VISIBLE_DEVICES=0 python3 -m torch.distributed.launch --nproc_per_node=1 --master_port=${MASTER_PORT} ../../evaluate.py \
                ${data} \
                --path=${path} \
                --user-dir=${user_dir} \
                --task=vqa_gen \
                --selected-cols=${selected_cols} \
                --bpe-dir=${bpe_dir} \
                --patch-image-size=480 \
                --prompt-type='none' \
                --batch-size=8 \
                --log-format=simple --log-interval=10 \
                --seed=7 \
                --gen-subset=${split} \
                --results-path=${result_path} \
                --fp16 \
                --zero-shot \
                --beam=${beam_size} \
                --unnormalized \
                --temperature=1.0 \
                --num-workers=0 \
                > ${log_file} 2>&1

Please feel free to let me know if this solves your issue.

JennDong commented 4 months ago

"evaluate_vqa_pretrained_beam_scale.sh" is my modified "evaluate_vqa_beam_scale.sh". I want to evaluate the pretrained model on VQA without finetuning. Changing to "--zero-shot \" works. But the score is quite low, below 0.1.

taokz commented 4 months ago

The issue should be related to uppercase characters in the generated answers, as mentioned in issue #6. To address this, I have been using the 'test_predict.json' file generated during evaluation and calculating accuracy after converting all uppercase characters in the gold answers to lowercase; this gives the result reported in the paper.
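For reference, here is a minimal sketch of that case-insensitive scoring. The field names ('question_id', 'answer') and the gold-answer file 'test_gold.json' are assumptions about the data layout, not the repo's actual format, so adjust them to match your own files:

import json

# Model outputs written during evaluation.
with open('test_predict.json') as f:
    predictions = json.load(f)

# Hypothetical gold-answer file mapping question_id -> reference answer;
# replace with however you store the ground truth.
with open('test_gold.json') as f:
    gold = json.load(f)

def normalize(answer):
    # Lowercase and trim so that casing differences are not counted as errors.
    return answer.strip().lower()

correct = sum(
    normalize(pred['answer']) == normalize(gold[str(pred['question_id'])])
    for pred in predictions
)
print(f'exact match: {correct / len(predictions):.4f}')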

JennDong commented 4 months ago

Thank you for the quick reply. I am now able to get an exact-match score of ~0.3.