Open oplatek opened 4 years ago
The reply in https://github.com/PyTorchLightning/lightning-Covid19/pull/10#discussion_r394055265 is that we shall overwrite it at a later stage, but for now we can stay with this dataloader.
The problem is that images from a single patient can end up in more than one of the train, validation, and test sets.
Possible data leakage? In the original dataset, there are several images from the same patient (see, for example, patient number 2). Should we take this into account when splitting the data?
Originally posted by @shpotes in https://github.com/PyTorchLightning/lightning-Covid19/pull/10
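One way to avoid this kind of leakage is to split on patients rather than on individual images. Below is a minimal sketch of that idea, not the repository's dataloader: it assumes each image comes with a patient identifier, and the `split_by_patient` helper, the `val_frac` parameter, and the sample `patient_ids` list are all hypothetical names made up for illustration.

```python
import random
from collections import defaultdict


def split_by_patient(patient_ids, val_frac=0.2, seed=0):
    """Return (train_idx, val_idx) so that no patient appears in both.

    patient_ids: one patient identifier per image (hypothetical input).
    """
    # Group image indices by the patient they belong to.
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)

    # Shuffle patients, not images, so each patient lands in exactly one split.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Hold out a fraction of the patients (at least one) for validation.
    n_val = max(1, round(len(patients) * val_frac))
    val_patients = set(patients[:n_val])

    train_idx = [i for p in patients if p not in val_patients for i in by_patient[p]]
    val_idx = [i for p in val_patients for i in by_patient[p]]
    return sorted(train_idx), sorted(val_idx)


# Made-up example: eight images, where patient 2 has three of them.
patient_ids = [1, 2, 2, 2, 3, 4, 4, 5]
train_idx, val_idx = split_by_patient(patient_ids)
```

Because the fraction is taken over patients, the resulting image counts per split will only be approximately `val_frac`. If scikit-learn is available, `sklearn.model_selection.GroupShuffleSplit` with the patient ids passed as `groups` should give the same guarantee.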