Baselines training program

OPMZZZ commented 1 month ago

Hello! It seems that the 'baselines' folder you mentioned contains three types of baseline networks. However, apart from the DAE, it appears that the other two DDPM-based baselines only have validation files. Could you please provide the training files as well? Thank you very much!

AlexanderFrotscher commented 1 month ago

Hey, sorry for not including them from the start. They are now uploaded in commit b30306f.

OPMZZZ commented 1 month ago

Thank you for sharing the code! I attempted to train AnoDDPM on the BRATS21 dataset using the configuration file you provided, but training is quite slow: with the settings in the config file, completing 233 epochs takes nearly a week. Did you achieve the reported AnoDDPM results on BRATS21, specifically the AUPRC of 0.769 and the Dice score of 0.687, using this same configuration? I’m currently training on the BRATS20 dataset with a similar proportion of training data, and your ANDi model achieves almost identical results there to those on BRATS21. AnoDDPM, however, is lagging behind in my run, with an AUPRC of 0.4859 and a Dice score of 0.40143. I’d appreciate any insights you might have!

AlexanderFrotscher commented 1 month ago

Hey, the AnoDDPM model is quite hard to train, and it also took us a huge amount of time. My supervisor trained the model, and I gave you the hyperparameters in the .yml script according to his Weights & Biases run. To speed up training, he computed a tensor of simplex noise with a batch size of 10k once at the beginning and never sampled simplex noise again. He then used the random vectorized transform function to rotate and change the noise, which reintroduces the "stochasticity" for each sample in the batch. He also used 2 GPUs, each with a batch size of 64. This is similar to the AnoDDPM eval function, where the complete noise for the diffusion backward pass is calculated in advance. The problem is that the original model was trained with a batch size of 1 and only on one specific slice location, so the authors did not run into this kind of problem. Yes, the results were achieved with this model and the eval script in the GitHub repo. After an extensive amount of training, this model can achieve results like this.
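For readers trying to reproduce the speed-up, here is a minimal sketch of the caching idea described above: compute an expensive noise bank once, then re-randomize samples drawn from it with cheap transforms. It is not the repository's code. The function names `make_noise_bank` and `sample_noise` are hypothetical, Gaussian noise stands in for simplex noise, the specific transforms (90° rotations, flips, circular shifts) are assumptions, and a plain per-sample loop is used for clarity rather than a vectorized transform.

```python
import numpy as np


def make_noise_bank(bank_size, size, rng):
    """Compute the expensive noise once, up front.

    ASSUMPTION: Gaussian noise is only a stand-in here; the thread describes
    a bank of simplex noise (10k samples in their run).
    """
    return rng.standard_normal((bank_size, size, size)).astype(np.float32)


def sample_noise(bank, batch_size, rng):
    """Draw a batch from the fixed bank and cheaply re-randomize each sample.

    ASSUMPTION: the transforms below are illustrative choices, not the
    repository's actual transform function. Samples must be square so that
    a 90-degree rotation keeps the shape.
    """
    idx = rng.integers(0, bank.shape[0], size=batch_size)
    batch = bank[idx]  # integer-array indexing copies, so the bank stays untouched
    out = np.empty_like(batch)
    for i, sample in enumerate(batch):
        # Rotate by a random multiple of 90 degrees.
        sample = np.rot90(sample, k=int(rng.integers(0, 4)))
        # Mirror horizontally half of the time.
        if rng.random() < 0.5:
            sample = np.flip(sample, axis=1)
        # Circularly shift by a random offset along both axes.
        shift = (
            int(rng.integers(0, sample.shape[0])),
            int(rng.integers(0, sample.shape[1])),
        )
        out[i] = np.roll(sample, shift, axis=(0, 1))
    return out


rng = np.random.default_rng(0)
bank = make_noise_bank(bank_size=1000, size=32, rng=rng)  # built once
noise = sample_noise(bank, batch_size=64, rng=rng)        # called every step
print(noise.shape)  # (64, 32, 32)
```

The trade-off in this kind of scheme is that every training step reuses noise from a finite bank, so the variety comes only from the random choice of sample and transform, in exchange for skipping the noise generation on each step.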

OPMZZZ commented 1 month ago

Thank you very much for your patient answer!
