zhjohnchan / M3AE

[MICCAI-2022] This is the official implementation of Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training.

Missing logger folder: result/task_pretrain_m3ae-seed0-from_ #2

Closed trx14 closed 1 year ago

trx14 commented 1 year ago

Hi, thanks for your excellent work and for sharing this awesome repo. When I ran your code, I got the warning "Missing logger folder: result/task_pretrain_m3ae-seed0-from_". I checked the /result folder and it was empty. Do you have any suggestions? Thanks!

zhjohnchan commented 1 year ago

Hi there,

Thanks for your attention!

You can ignore that warning. It just means the folder will be created automatically and the results stored there.

Best, Zhihong
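
For anyone who wants to silence the warning rather than ignore it, a minimal sketch follows, assuming the message comes from PyTorch Lightning's experiment logger (which prints "Missing logger folder" when the output directory does not exist yet). Pre-creating the folder has the same effect as letting the logger create it:

```python
import os

# Minimal sketch: pre-create the directory named in the "Missing logger folder"
# warning. The logger would create it on first write anyway, which is why the
# warning is safe to ignore.
log_dir = "result/task_pretrain_m3ae-seed0-from_"  # use the exact path from your own warning
os.makedirs(log_dir, exist_ok=True)
```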

trx14 commented 1 year ago

Thanks for your quick reply. I ran the experiments on 4 A100 GPUs and changed the number of GPUs to 4 in the shell script. However, the program has been stuck at the following step for 20 minutes (see the figure below), and the process isn't using the GPUs. Do you have any suggestions? Thanks!

[Screenshot: Screen Shot 2022-09-21 at 00 30 36, showing the console output where the run hangs]

zhjohnchan commented 1 year ago

Hi, here are some suggestions: (1) use 1 GPU for debugging so you can insert traces in your program; (2) then you can find the place where it gets stuck. In my experience it is probably data loading, so check whether it is stalling because of a memory limit.
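
One concrete way to follow suggestion (2) is to iterate the DataLoader on its own, in a single process, and time each batch. The sketch below is illustrative only; `build_pretrain_dataset` is a hypothetical stand-in for however the repo actually constructs its pre-training dataset, not a function in M3AE:

```python
import time
from torch.utils.data import DataLoader

# Rough sketch: iterate the DataLoader with num_workers=0 and time each batch
# to check whether data loading is what hangs the run.
def probe_dataloader(dataset, num_batches=5):
    loader = DataLoader(dataset, batch_size=2, num_workers=0)
    start = time.time()
    for i, batch in enumerate(loader):
        print(f"batch {i} ready after {time.time() - start:.1f}s")
        if i + 1 >= num_batches:
            break

# probe_dataloader(build_pretrain_dataset(config))  # hypothetical helper
```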

trx14 commented 1 year ago

Thanks for your suggestions. I ran the experiments for VL-pretraining and then fine-tuned on VQA-SLAKE and IRTR-ROCO. Because of the memory limit, I only used ROCO for VL-pretraining.

For SLAKE, I got Open: 77% and Closed: 85%. For IRTR-ROCO, I changed the fine-tuning batch size to 20 and got ir_r1: 6.5%, ir_r5: 21.7%, ir_r10: 33.5%, tr_r1: 7.0%, tr_r5: 23.2%, tr_r10: 34.9%. The IRTR performance is significantly lower than in the paper. One explanation could be that I only used ROCO for pre-training and that the fine-tuning batch size is much smaller than your setting (which is 4096).

I think your pre-trained model could be a great baseline for many medical-domain downstream tasks, and I really want to build models on top of it. Do you have any plans to open-source the pre-trained models? Thanks!
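
On the batch-size gap specifically, gradient accumulation is a common way to approximate a large effective batch without extra GPU memory: effective batch size = per-GPU batch × number of GPUs × accumulation steps. A hedged back-of-the-envelope sketch, with variable names that are purely illustrative and not config keys from this repo:

```python
# Hedged arithmetic sketch: how many accumulation steps recover the paper's
# effective batch size. None of these names come from the repo.
per_gpu_batch_size = 20     # what fits in memory per GPU
num_gpus = 4
target_batch_size = 4096    # the paper's setting

grad_accum_steps = max(1, target_batch_size // (per_gpu_batch_size * num_gpus))
print(grad_accum_steps)     # -> 51
```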

zhjohnchan commented 1 year ago

Hi, I think ROCO alone is too small for pre-training; try using MedICaT together with it. As for the pre-trained models, you can find them in the README, where I have already open-sourced them.
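
To build on the released weights, a typical pattern is to load the checkpoint's state dict into your own model with `strict=False`. A minimal sketch, assuming a standard PyTorch/Lightning `.ckpt` file; the file path and `MyDownstreamModel` below are placeholders, not part of the repo's actual API:

```python
import torch

# Hedged sketch of reusing a released checkpoint as a backbone.
ckpt = torch.load("path/to/m3ae_pretrained.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)  # Lightning checkpoints nest weights under "state_dict"

# model = MyDownstreamModel(...)                                   # your own model class
# missing, unexpected = model.load_state_dict(state_dict, strict=False)
# print("missing:", missing, "unexpected:", unexpected)            # inspect what transferred
```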