Closed BSchilperoort closed 2 years ago
Leave-n-out is basically a k-fold split with `shuffle=False` and k = (total number of years) / n.
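As a rough illustration of that equivalence, here is a minimal sketch using sklearn's `KFold`. The year range and the value of `n` are made up for the example, and it assumes the number of years is divisible by `n` (otherwise `KFold` produces folds of unequal size).

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical data: 12 years, leaving out n = 3 years per split.
years = np.arange(2000, 2012)
n = 3
k = len(years) // n  # k = total number of years / n

# shuffle=False keeps each test fold as a block of consecutive years.
splitter = KFold(n_splits=k, shuffle=False)
splits = [(years[train], years[test]) for train, test in splitter.split(years)]

for train_years, test_years in splits:
    print("test:", test_years, "train:", train_years)
```

Each of the `k` splits holds out one block of `n` consecutive years as the test set and trains on the rest.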
At some point, it would also be nice to be able to skip the n training years adjacent to a test year. Highly autocorrelated time series may leak information from one year to the next. To reduce or eliminate this effect, the years adjacent to the test years can be removed from the training datasets.
See the `gap_prior` and `gap_after` arguments in the legacy code: https://github.com/AI4S2S/proto/blob/77734930a40b8aaefcf1e390efe0e3ac93b40858/RGCPD/class_RGCPD.py#L312-L314
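A minimal sketch of what such a gap could look like on top of a non-shuffled k-fold split. The helper `split_with_gap` is hypothetical, and the parameter names `gap_prior` and `gap_after` are only borrowed from the legacy arguments; their semantics here (number of years dropped before and after each test block) are an assumption, not necessarily what the legacy code does.

```python
import numpy as np
from sklearn.model_selection import KFold


def split_with_gap(years, n_splits, gap_prior=1, gap_after=1):
    """Yield (train_years, test_years), dropping training years near the test block.

    Hypothetical helper: assumes each test fold is a contiguous block of years,
    which holds for KFold with shuffle=False on sorted, consecutive years.
    """
    years = np.asarray(years)
    for train_idx, test_idx in KFold(n_splits=n_splits, shuffle=False).split(years):
        train_years, test_years = years[train_idx], years[test_idx]
        first_excluded = test_years.min() - gap_prior
        last_excluded = test_years.max() + gap_after
        # Keep only training years strictly outside the widened test window.
        keep = (train_years < first_excluded) | (train_years > last_excluded)
        yield train_years[keep], test_years


gap_splits = list(split_with_gap(np.arange(2000, 2012), n_splits=4))
for train_years, test_years in gap_splits:
    print("test:", test_years, "train:", train_years)
```

With the default gaps of one year, a test block of 2003 to 2005 would also remove 2002 and 2006 from the training years.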
Is this issue closed by #53 ?
I believe #53 will close this issue, yes. It seems that all legacy splitting methods are indeed supported by the code in that PR.
The new train/test implementation relies on sklearn for the splitter classes. These mostly correspond to the legacy code methods (see image below)
@semvijverberg could you describe here what the `leave` splitter is supposed to do?