wl-stepp / adaptive-imaging

Tools for adaptive imaging in the LEB-EPFL

Rotating images before train/test data split #2

Open bartolsthoorn opened 1 year ago

bartolsthoorn commented 1 year ago

Hello.

In the manuscript, it is noted that "This dataset was enhanced 10-fold by rotating the images, to make up the final dataset (37000 images)."

However, this is done before the train/test split is made! Is it therefore fair to say that it is very likely that the test dataset simply contains rotated versions of the training data?

If this is true, it is problematic: although U-Net is not rotationally invariant, the test data cannot fairly be called unseen data. For example, see the discussion here: https://stats.stackexchange.com/questions/412992/data-augmentation-on-entire-dataset-before-splitting

wl-stepp commented 1 year ago

Thank you for taking such a close look and raising this point. You are right that, ideally, the train/test split should have been done before augmenting. Although U-Nets are not rotationally invariant (and hence produce different outputs for rotated inputs), splitting before augmenting would avoid the possibility of test data leaking into training. We think, however, that this is unlikely to have had major downstream effects on event detection for event-driven acquisition, since the detections did work on the live data to trigger events.
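
For reference, a minimal sketch of splitting before augmenting. This is not the code from this repository. The function names, the evenly spaced rotation angles, the 20% test fraction, and the count of 3700 source images (inferred from 37000 / 10) are all assumptions made for illustration.

```python
import random


def split_then_augment(source_ids, test_fraction=0.2, n_rotations=10, seed=0):
    """Split the ORIGINAL images first, then list rotated copies per subset.

    Returns two lists of (source_id, angle_in_degrees) pairs. The angles are
    hypothetical; the manuscript does not state which rotations were used.
    """
    ids = list(source_ids)
    random.Random(seed).shuffle(ids)

    # Assign each source image to exactly one subset.
    n_test = max(1, round(test_fraction * len(ids)))
    test_ids, train_ids = ids[:n_test], ids[n_test:]

    # Augment each subset separately, so every rotated copy stays with its source.
    angles = [i * 360.0 / n_rotations for i in range(n_rotations)]
    train = [(i, a) for i in train_ids for a in angles]
    test = [(i, a) for i in test_ids for a in angles]
    return train, test


def leaked_sources(train, test):
    """Return source ids that appear in both subsets (empty set means no leak)."""
    return {i for i, _ in train} & {i for i, _ in test}


# Assumed dataset size: 3700 source images x 10 rotations = 37000 samples.
train, test = split_then_augment(range(3700))
print(len(train), len(test), leaked_sources(train, test))
```

The same `leaked_sources` check can be run on an existing split, provided each augmented sample records which source image it came from.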