Sub-dataset selection method

To train this contrast classifier, I have access to a large pool of data sourced from NeuroPoly servers and OpenNeuro. To make the most of it, I aim to build a balanced and diverse sub-dataset from which a robust model can be developed. In particular, I want the model to learn the relationship between image content and contrast, rather than incidental associations between contrast and the acquisition characteristics of my sub-dataset (such as resolution, orientation, or framing).

- Balance among contrasts will be ensured by assigning each contrast a sampling weight inversely proportional to its representation in the dataset (upsampling of under-represented contrasts).

- Data augmentation will simulate variations in framing, orientation, and resolution through random crops, rotations, and downscaling.

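- I will estimate the dataset's bias with respect to different characteristics (such as resolution or orientation) by evaluating basic classifiers trained to predict the contrast from these characteristics alone. The closer these classifiers are to chance level, the less the characteristic leaks the contrast, and the better the dataset.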
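
As a rough illustration of the first and third points, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example: the function names, the toy metadata, and the choice of a per-value majority vote as the "basic classifier" are hypothetical, not the actual implementation.

```python
from collections import Counter, defaultdict


def inverse_frequency_weights(contrasts):
    """Return one sampling weight per image, inversely proportional to the
    frequency of its contrast, so each contrast carries the same total weight."""
    counts = Counter(contrasts)
    return [1.0 / counts[c] for c in contrasts]


def probe_accuracy(characteristic, contrasts):
    """Accuracy of a basic classifier that sees only one characteristic
    (e.g. orientation) and predicts the contrast most often paired with it.
    It is scored on the data it was fitted on, so the value is optimistic."""
    by_value = defaultdict(Counter)
    for value, contrast in zip(characteristic, contrasts):
        by_value[value][contrast] += 1
    # For each characteristic value, the majority contrast is predicted.
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return correct / len(contrasts)


def chance_accuracy(contrasts):
    """Accuracy obtained by always predicting the most frequent contrast."""
    return Counter(contrasts).most_common(1)[0][1] / len(contrasts)


# Hypothetical toy metadata: contrast and orientation of six images.
contrasts = ["T1w", "T1w", "T1w", "T2w", "T2w", "T2star"]
orientation = ["sag", "sag", "sag", "ax", "ax", "ax"]

weights = inverse_frequency_weights(contrasts)
print(weights)
print(probe_accuracy(orientation, contrasts), chance_accuracy(contrasts))
```

In this toy case the orientation-only probe reaches 5/6 while the majority baseline is 0.5, which would indicate that orientation leaks the contrast and that the sub-dataset should be rebalanced. The per-image weights could then be passed to a weighted sampler, for instance `torch.utils.data.WeightedRandomSampler` if the training uses PyTorch.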
