soinx0629 / vis_dec_neurips

Official Implementation for NeurIPS'23 paper Contrast, Attend and Diffuse to Decode High-Resolution Images from Brain Activities

How many GPUs are needed to train your model in both phases? #2

Closed xmu-xiaoma666 closed 6 months ago

xmu-xiaoma666 commented 9 months ago

Thanks for your great work and code! How many GPUs are needed to train your model in both phases?

AmingWu commented 9 months ago

How long does your model need to train?

soinx0629 commented 9 months ago

We take training on the GOD dataset as an example. For the fMRI representation learning model, we train Phase 1 for 150 epochs and Phase 2 for 60 epochs on one NVIDIA A100 GPU. The two phases together take about 12 hours. After the two phases, we save only the checkpoint of the fMRI encoder.

For the diffusion model, we use the pre-trained label-to-image latent diffusion model. The model has 401.32M parameters, but during finetuning we tune only the weights in the cross-attention layers and the fMRI encoder. We finetune the model for 500 epochs on one NVIDIA V100 GPU, which takes around 24 hours.
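For readers wondering what "tune only the cross-attention layers and the fMRI encoder" might look like in practice, here is a minimal sketch that selects parameters by name. It is not taken from this repository: the keyword `attn2` (a naming convention for cross-attention in some latent diffusion codebases), the `fmri_encoder` prefix, and all example parameter names are assumptions for illustration, so check the real parameter names before using anything like it.

```python
# Sketch: choose which parameters to finetune by matching on their names.
# ASSUMPTION: cross-attention parameters contain "attn2" in their names and
# the fMRI encoder lives under a hypothetical "fmri_encoder" prefix.
# Neither is confirmed by the repository; inspect the actual names first.

def select_trainable(param_names, keywords=("attn2", "fmri_encoder")):
    """Return the parameter names that contain any of the given keywords."""
    return [name for name in param_names if any(k in name for k in keywords)]


# Hypothetical parameter names, for illustration only.
example_names = [
    "unet.block1.attn1.to_q.weight",  # self-attention: stays frozen
    "unet.block1.attn2.to_q.weight",  # cross-attention: tuned
    "unet.block1.ff.net.weight",      # feed-forward: stays frozen
    "fmri_encoder.proj.weight",       # fMRI encoder: tuned
]

trainable = select_trainable(example_names)
print(trainable)
```

With a PyTorch model, the same filter could be applied while iterating over `model.named_parameters()`, setting `requires_grad = False` on every parameter whose name is not selected, and passing only the remaining parameters to the optimizer. Freezing most of the 401.32M parameters this way is what keeps single-GPU finetuning feasible.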