taokz / BiomedGPT

BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks
Apache License 2.0

RuntimeError: result type Float can't be cast to the desired output type Long #16

Closed · ACTT0 closed this issue 5 months ago

ACTT0 commented 6 months ago

Hello, thank you for your wonderful project. I have a question: when I use your pre-trained weights biomedgpt_tiny.pt to fine-tune on the VQA-RAD dataset, I get the following error in the EMA step:

File "/usr/local/lib/python3.10/dist-packages/fairseq/models/ema/ema.py", line 197, in step self._step_internal(new_model, updates) File "/usr/local/lib/python3.10/dist-packages/fairseq/models/ema/ema.py", line 171, in _step_internal emaparam.mul(decay) RuntimeError: result type Float can't be cast to the desired output type Long

I am using numpy 1.23.1 and PyTorch 2.1.2, with the latest fairseq version.
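
For reference, the failure reproduces outside fairseq as well: PyTorch refuses an in-place operation whose promoted result type cannot be cast back to the output tensor, so multiplying a Long tensor in place by a float raises exactly this error (a minimal sketch, not the repository's code):

```python
import torch

decay = 0.9999
float_param = torch.zeros(3)                    # ordinary float weight: fine
float_param.mul_(decay)

long_param = torch.zeros(3, dtype=torch.long)   # an integer parameter/buffer
long_param.mul_(decay)                          # RuntimeError: result type Float
                                                # can't be cast to the desired
                                                # output type Long
```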

taokz commented 6 months ago

My code currently does not support PyTorch 2.x. In addition, to ensure compatibility with fairseq, please remember to use pip version 21.2.4 when installing the required dependencies. You can easily set up the environment using the 'biomedgpt.yaml' configuration file; alternatively, you can install PyTorch 1.13.x directly to meet the compatibility requirements.
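
A minimal pre-flight check matching this advice could look like the following (a sketch; the version strings are simply the ones recommended in this thread):

```python
import pip
import torch

# PyTorch 2.x breaks the fairseq EMA path used by this repo.
assert torch.__version__.startswith("1.13"), (
    f"expected PyTorch 1.13.x, found {torch.__version__}")

# Dependency resolution was only verified with pip 21.2.4.
assert pip.__version__ == "21.2.4", (
    f"expected pip 21.2.4, found {pip.__version__}")
```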

ACTT0 commented 6 months ago

Thank you very much; I have fixed it and successfully run the fine-tuning on VQA-RAD. Now that I have the tiny.log file (after evaluation), how can I read off the model's performance?

I use all the same settings as you. This is the tiny.log file: tiny.log

taokz commented 6 months ago

You can see the result in this line of the log:

```
2024-02-06 19:47:08 | INFO | ofa.evaluate | score_sum: tensor([8.], device='cuda:0'), score_cnt: tensor([451.], device='cuda:0'), score: 0.0177
```

The score is the accuracy, but I'm not sure which checkpoint you used; the performance looks far too low. Did you fine-tune the pre-trained model yourself? If so, you may need to double-check the configuration and adjust the hyper-parameters.
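
For clarity, the reported score is simply score_sum divided by score_cnt from that log line:

```python
# Values taken from the tiny.log line quoted above.
score_sum, score_cnt = 8.0, 451.0
print(round(score_sum / score_cnt, 4))  # 0.0177, i.e. roughly 8 of 451 correct
```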

ACTT0 commented 6 months ago

I used your weights file biomedgpt_tiny.pt to fine-tune on VQA-RAD and did not change any setting, except that I used 1 GPU (an A100) instead of 4. Do I need to increase the batch size?
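
For context, here is my understanding of the effective batch size (a sketch assuming fairseq-style gradient accumulation via update_freq; the per-GPU batch of 16 is a placeholder, not the script's actual value):

```python
# effective batch = n_gpus * batch_size_per_gpu * update_freq
def effective_batch(n_gpus, batch_size_per_gpu, update_freq=1):
    return n_gpus * batch_size_per_gpu * update_freq

# 1 GPU with update_freq=4 matches 4 GPUs with update_freq=1:
assert effective_batch(1, 16, update_freq=4) == effective_batch(4, 16)
```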

I downloaded the weights file from this link: https://www.dropbox.com/sh/cu2r5zkj2r0e6zu/AADZ-KHn-emsICawm9CM4MqVa?dl=0

These are the two .sh files I use for training and inference on VQA-RAD (renamed from .sh to .txt because I cannot upload .sh files): train: train_vqa_rad_beam_scale.txt, inference: evaluate_vqa_rad_beam_scale.txt

Also, when I evaluate with your base-scale checkpoint for VQA-RAD, I only get score = 0.3038.

taokz commented 5 months ago

@ACTT0 Hi, sorry for the late response. The previous checkpoint was uploaded by mistake, as mentioned in issue #6; I have now provided the correct one.

By the way, fine-tuning on VQA-RAD generally requires more training steps because the answers to its open-ended questions are not well aligned. I tried generation over all candidate answers instead of beam search via val_inference_type=allcand and got better results.
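
Conceptually, the two validation modes differ as in this sketch (the function names are hypothetical, not the actual API of this repo):

```python
# beamsearch: generate freely, then require an exact string match with the label.
# This is brittle for open-ended questions whose answers are poorly aligned.
def beamsearch_eval(generate, question, label):
    return float(generate(question) == label)

# allcand: score every candidate answer and take the argmax, so the prediction
# is always a valid answer drawn from the candidate set.
def allcand_eval(score, question, candidates, label):
    best = max(candidates, key=lambda a: score(question, a))
    return float(best == label)
```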