My understanding of how the model was trained in this project is the following. Let's consider a dataset of 100 subjects.
model 1: trained on subjects (0:19)
applied model 1 to subjects (20:99)
note: ground truths for (0:19) were not updated
selected the best inferences, let's say subjects (20:49)
manually corrected the GT for (20:49)
model 2a: fine-tuning: model 1 --> model 2a using ONLY subjects (20:49)
model 2b: trained using subjects (0:49)
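To make the subject partitioning above explicit, here is a minimal sketch in plain Python. It only manipulates subject indices (the actual training/fine-tuning calls are out of scope); `split_rounds` and its parameter names are hypothetical, not part of the project's code.

```python
def split_rounds(n_subjects=100, n_initial=20, n_selected=30):
    """Reproduce the subject splits described above, as index lists."""
    round1 = list(range(0, n_initial))            # subjects 0..19: GT made from scratch, used for model 1
    pool = list(range(n_initial, n_subjects))     # subjects 20..99: inferred by model 1
    round2 = pool[:n_selected]                    # best inferences, subjects 20..49: GT manually corrected
    remaining = pool[n_selected:]                 # subjects 50..99: not yet corrected
    return round1, round2, remaining

r1, r2, rest = split_rounds()
# model 2a: fine-tune model 1 on r2 only        -> data (20:49)
# model 2b: train from scratch on r1 + r2       -> data (0:49)
```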
According to @rohanbanerjee, model 2a performs better than model 2b (as discussed in issue #36).
However, one risk is model drift towards images of 'bad' quality: as the number of training rounds increases, the data quality shifts towards 'bad' cases (i.e. the 'good' cases were used for model 1 and might now be forgotten). We need a validation strategy to ensure this does not happen @rohanbanerjee
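One possible validation strategy (a sketch, not the project's actual pipeline): keep a fixed held-out validation set drawn from the 'good'-quality round-1 subjects, score each new model on it, and flag any round whose mean score drops below the best score seen so far. The metric values below are made-up placeholders (e.g. Dice).

```python
def drift_check(scores_by_round, tolerance=0.02):
    """Flag rounds whose mean score on the fixed 'good' validation set
    falls more than `tolerance` below the best mean seen so far."""
    best = float("-inf")
    flagged = []
    for rnd, scores in enumerate(scores_by_round, start=1):
        mean = sum(scores) / len(scores)
        if mean < best - tolerance:
            flagged.append(rnd)  # quality on 'good' cases regressed: possible drift
        best = max(best, mean)
    return flagged

# hypothetical per-subject Dice on the fixed validation set after each round:
flagged = drift_check([[0.90, 0.88], [0.91, 0.89], [0.80, 0.82]])
# -> round 3 is flagged
```

The key design choice is that the validation set never changes across rounds, so a score drop can only come from the model, not from shifting data.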
My suggestion (Julien writing here):
model 2c: fine tuning: model 1 --> model 2c using subjects (0:49)
Sources: