sct-pipeline / contrast-agnostic-softseg-spinalcord

Contrast-agnostic spinal cord segmentation project with softseg
MIT License

For final model, do hyperparam optimization #40

Closed jcohenadad closed 11 months ago

jcohenadad commented 1 year ago

Ideally with wandb sweep: https://github.com/ivadomed/ivadomed/issues/1124
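
For reference, a minimal wandb sweep over the hyperparameters discussed below could look roughly like the sketch here; the sweep method, project name, metric name, parameter ranges, and the `train` stub are illustrative assumptions, not the repository's actual configuration.

```python
# Hypothetical wandb sweep sketch; names and ranges are illustrative only.
import wandb

sweep_config = {
    "method": "grid",  # could also be "random" or "bayes"
    "metric": {"name": "val_dice", "goal": "maximize"},
    "parameters": {
        "batch_size": {"values": [2, 4, 8]},
        "optimizer": {"values": ["adam", "sgd"]},
        "learning_rate": {"values": [1e-3, 1e-2]},
    },
}

def train():
    # Each agent run gets one hyperparameter combination via wandb.config
    run = wandb.init()
    cfg = wandb.config
    # ... build dataloaders/model/optimizer from cfg, train, then e.g.:
    # wandb.log({"val_dice": dice})
    run.finish()

sweep_id = wandb.sweep(sweep=sweep_config, project="contrast-agnostic-softseg")
wandb.agent(sweep_id, function=train, count=10)
```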

naga-karthik commented 1 year ago

I didn't do a wandb sweep per se, but I played with the following important hyperparameters during my experiments (a short sketch putting the final choices together follows the list):

  1. Batch-size --> [2, 4, 8]
    • Ideally, the smaller the batch-size, the better the generalization (see the abstract of this paper). A batch-size of 2 could fit on romane's GPU, so I fixed the batch-size to 4.
  2. Optimizer --> [Adam, SGD]
    • I tried SGD with a learning rate of 0.01 (same as nnUNet's), but its performance was not close to that of Adam with a learning rate of 1e-3. SGD also took longer to converge than Adam.
  3. Model --> Monai's UNet, ivadomed's Modified 3D UNet
    • This is basically a comparison of different implementations of the UNet model. MONAI's model was simply not powerful enough compared to ivadomed's (Dice score difference of 0.1-0.2 between the two models). Hence, I fixed the model to ivadomed's implementation.
  4. Patch-size --> [64x128x64, 80x192x160, 160x224x96]
    • This is the single most important hyperparameter. In general, smaller patch sizes tended towards more CSA bias compared to models with larger patch sizes. Internal progress reports show the CSA plots.
    • NOTE: These values were chosen by keeping the median patch size 192x228x106 as the reference (i.e. the tested patch sizes are close to some multiple of the median patch size).
    • In view of these results, the patch-size is fixed to 160x224x96.
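
To make the final choices concrete, here is a minimal sketch of how they could be wired together with PyTorch/MONAI. The model shown is MONAI's generic UNet purely as a stand-in (the project actually settled on ivadomed's Modified 3D UNet), and the transform keys, channel settings, and loss choice are assumptions, not the repository's code.

```python
# Illustrative training setup reflecting the hyperparameters discussed above.
# NOTE: the project uses ivadomed's Modified 3D UNet; MONAI's UNet is shown
# here only as a placeholder, and the transform/channel settings are assumed.
import torch
from monai.networks.nets import UNet
from monai.losses import DiceLoss
from monai.transforms import RandCropByPosNegLabeld

PATCH_SIZE = (160, 224, 96)   # final patch size chosen above
BATCH_SIZE = 4                # final batch size chosen above

# Patch sampling around the spinal cord label (keys are hypothetical)
crop = RandCropByPosNegLabeld(
    keys=["image", "label"], label_key="label",
    spatial_size=PATCH_SIZE, pos=1.0, neg=1.0, num_samples=1,
)

# Placeholder 3D UNet (stand-in for ivadomed's Modified 3D UNet)
model = UNet(
    spatial_dims=3, in_channels=1, out_channels=1,
    channels=(32, 64, 128, 256), strides=(2, 2, 2), num_res_units=2,
)

# Adam at 1e-3 outperformed SGD at 0.01 in the experiments above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Soft segmentation target -> sigmoid output with a Dice-based loss
loss_fn = DiceLoss(sigmoid=True)
```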
jcohenadad commented 1 year ago

I suggest adding the info above to the manuscript's Methods section before we forget.

naga-karthik commented 11 months ago

We have a close-to-final version (we just need to run more seeds now) with all hyperparameter tuning already done. Those hyperparams are described in the manuscript, hence closing!