Open rohanbanerjee opened 7 months ago
@rohanbanerjee I would like to fine-tune as well, mostly because it would be more resource efficient (fewer epochs). When you fine-tune, do you mean training on the bigger updated dataset or only on the new data?
Also, I don't see why the fine-tuning strategy is not reproducible. You could simply provide the initial weights and anyone could re-train your model, no?
Thanks @rohanbanerjee for opening a discussion on this!
I observed that in a few cases the model was not segmenting the first and the last slice.
Wait, is this also the case with the nnunet model? Because I am seeing something similar with the contrast-agnostic model's inference. Could you please confirm that you trained and tested using the nnunet model?
Addresses the shortcomings observed in the inference of the baseline models
By this, you mean that the model that was used for finetuning (initialized with the baseline model's weights) produces better segmentations at test time? i.e. first and last slices are properly segmented?
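To make the first/last-slice failure easy to spot across subjects, a quick check like the one below could flag predicted masks whose edge slices are empty. This is a minimal sketch: the `empty_edge_slices` helper and the axis convention are assumptions, not part of anyone's pipeline here.

```python
import numpy as np

def empty_edge_slices(mask, axis=2):
    """Flag whether the first/last slices along `axis` contain no
    foreground voxels (the failure mode discussed above)."""
    n = mask.shape[axis]
    first = np.take(mask, 0, axis=axis)
    last = np.take(mask, n - 1, axis=axis)
    return {"first_empty": not first.any(), "last_empty": not last.any()}

# toy predicted mask: foreground only in the middle slices
vol = np.zeros((4, 4, 5), dtype=np.uint8)
vol[:, :, 1:4] = 1
print(empty_edge_slices(vol))  # → {'first_empty': True, 'last_empty': True}
```

Running this over all test predictions would give a count of how often each model misses the edge slices, which is more convincing than eyeballing a few subjects.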
Not reproducible (as mentioned in the Pro of the Retraining)
I don't think I agree with this. As Armand suggested, it is much easier to fine-tune on the new data (given the weights of your pretrained model) rather than collecting all the data (i.e. from the various active learning rounds) and then re-training everything. Plus, a benefit is that fine-tuning takes fewer epochs than re-training everything from scratch.
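For reference, the fine-tuning workflow amounts to loading the baseline checkpoint before training on the round-1 subjects only. The sketch below uses a toy network, synthetic data, and a hypothetical checkpoint name (`baseline_round0.pt`), not the actual nnU-Net training code:

```python
import torch
import torch.nn as nn

# Toy stand-in for the segmentation network (hypothetical architecture).
def make_model():
    return nn.Sequential(nn.Conv3d(1, 4, 3, padding=1), nn.Conv3d(4, 2, 1))

# --- round 0: pretend this is the trained baseline, and checkpoint it ---
baseline = make_model()
torch.save(baseline.state_dict(), "baseline_round0.pt")

# --- round 1: initialize from the baseline weights instead of from scratch ---
model = make_model()
model.load_state_dict(torch.load("baseline_round0.pt"))

# Fine-tune on the NEW subjects only, typically with a smaller learning
# rate and far fewer epochs than training from scratch.
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
x = torch.randn(2, 1, 8, 8, 8)            # synthetic round-1 images
y = torch.randint(0, 2, (2, 8, 8, 8))     # synthetic round-1 labels
for epoch in range(3):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()
print("fine-tuning ran, final loss:", float(loss))
```

Sharing that round-0 checkpoint alongside the round-1 data is also what makes the fine-tuning strategy reproducible, per Armand's point above.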
BUT, before concluding that fine-tuning is the way to proceed in your case, consider this experiment. Fix a test set (call it Test Set A) and compare the performance on Test Set A of:
1. the baseline model,
2. the fine-tuned model (trained on Train Set B, initialized with the baseline model's weights), and
3. a new model retrained from scratch on Train Set A + Train Set B.

If the fine-tuned model (2) performs as well as (or better than) the model retrained on both Train Sets A and B (3), then you can proceed with fine-tuning for your future rounds of active learning. Let me know if this makes sense, I'd be happy to clarify further!
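The comparison above could be scored with per-subject Dice on the fixed Test Set A. A minimal sketch, using synthetic masks in place of real predictions (the error rates and model names are illustrative assumptions):

```python
import numpy as np

def dice(pred, gt, eps=1e-8):
    """Dice coefficient between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    return (2 * inter + eps) / (pred.sum() + gt.sum() + eps)

# Hypothetical predictions on Test Set A from the three models being compared:
# (1) baseline, (2) fine-tuned on Train B, (3) retrained on Train A + B.
rng = np.random.default_rng(0)
gt = rng.integers(0, 2, size=(5, 16, 16)).astype(bool)  # 5 test subjects
preds = {
    "baseline": gt ^ (rng.random(gt.shape) < 0.10),     # ~10% voxel errors
    "fine_tuned": gt ^ (rng.random(gt.shape) < 0.03),
    "retrained_A+B": gt ^ (rng.random(gt.shape) < 0.04),
}
for name, p in preds.items():
    scores = [dice(p[i], gt[i]) for i in range(len(gt))]
    print(f"{name}: mean Dice = {np.mean(scores):.3f}")
```

Keeping Test Set A fixed across all three models is the key point: it makes the per-subject scores directly comparable (e.g. with a paired test) rather than confounded by a changing test set.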
I am opening this issue to have a discussion around retraining vs fine-tuning models across different active learning rounds.
Context:
I have trained a baseline model (ref #34) and now I am moving on to my next round of training (referred to as round 1 from now on), for which I will use 30 subjects. I have used two different strategies for this round of re-training:
Pros:
Cons:
sub-nwMW07 (from Northwestern Motor Weber)
The problem observed with retraining was solved by fine-tuning
Cons:
I gave one example of how I am observing that fine-tuning is better than retraining, but I would like to hear whether anyone had a different experience or anything else I should keep in mind.
P.S. I do understand that this strategy depends on the type of region of interest, but I still wanted inputs. Tagging @valosekj @naga-karthik @plbenveniste @Nilser3 @hermancollin