Hi Toby. Yes, that makes a lot of sense.
First, let me APOLOGIZE SERIOUSLY for not replying to your last email, like a total arsehole. I was overworked. I will try to be better now.
Do you wanna continue here or get a Mattermost invite, which might be easier?
Hi Bernd, thanks for your message. No problem about my last email. I'm sorry to hear you were overworked, and I hope you are feeling better now. I have never used Mattermost, but if that is what you prefer, I can give it a try. Email and GitHub issues are best for me.
GRRRRR. I am already slow here. Let's email, and I will also put a link there.
I think we switched to email.
Hi all! I plan to work on a research project in Paris from January to June 2025. During that time, would it be possible/interesting for you if I came to visit the mlr3 team and gave a research presentation? Which location would make sense? From GitHub user pages I see Leipzig, Berlin, Munich...?
https://arxiv.org/abs/2410.08643 https://github.com/tdhock/mlr3resampling
Title: SOAK: Same/Other/All K-fold cross-validation for estimating similarity of patterns in data subsets
Abstract: In many real-world applications of machine learning, we are interested to know whether it is possible to train on the data that we have gathered so far, and obtain accurate predictions on a new test data subset that is qualitatively different in some respect (time period, geographic region, etc.). Another question is whether data subsets are similar enough that it is beneficial to combine subsets during model training. We propose SOAK, Same/Other/All K-fold cross-validation, a new method which can be used to answer both questions. SOAK systematically compares models which are trained on different subsets of data, and then used for prediction on a fixed test subset, to estimate the similarity of learnable/predictable patterns in data subsets. We show results of using SOAK on six new real data sets (with geographic/temporal subsets, to check if predictions are accurate on new subsets), three image pair data sets (subsets are different image types, to check that we get smaller prediction error on similar images), and 11 benchmark data sets with predefined train/test splits (to check similarity of predefined splits).
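
For anyone who wants to try this: below is a minimal sketch of a SOAK run with mlr3resampling. The names ResamplingSameOtherCV, the subset/stratum column roles, the folds parameter, and the mlr3resampling::score() helper are assumptions based on my reading of the package vignette, not a verified example, so please check the repository documentation for the current API.

```r
# Minimal sketch of a SOAK analysis; class/function names
# (ResamplingSameOtherCV, the "subset" column role, mlr3resampling::score)
# are assumed from the package vignette -- check the repo docs.
library(data.table)
library(mlr3)
library(mlr3resampling)

# Toy regression data with two qualitatively different subsets ("person"):
# both people share a sinusoidal pattern, but Bob's is shifted upward.
N <- 300
set.seed(1)
x <- runif(N, -2, 2)
person <- rep(c("Alice", "Bob"), each = N / 2)
y <- sin(x) + ifelse(person == "Alice", 0, 1) + rnorm(N, sd = 0.1)
task.dt <- data.table(x = x, y = y, person = factor(person))

reg.task <- TaskRegr$new("sin_people", task.dt, target = "y")
# Declare which column defines the data subsets that SOAK should compare.
reg.task$col_roles$subset <- "person"
reg.task$col_roles$stratum <- "person"

# Same/Other/All K-fold CV: each test fold in one subset gets predictions
# from models trained on the same subset, the other subset(s), and all data.
soak <- ResamplingSameOtherCV$new()
soak$param_set$values$folds <- 3

grid <- benchmark_grid(
  tasks = reg.task,
  learners = lrns(c("regr.featureless", "regr.rpart")),
  resamplings = soak)
bench.result <- benchmark(grid)

# mlr3resampling::score() should join the same/other/all metadata onto the
# usual mlr3 scores, so test error can be compared by training subset.
score.dt <- mlr3resampling::score(bench.result)
print(score.dt)
```

Reading the output: if training on the other subset gives test error comparable to training on the same subset, the patterns in the subsets are similar; if training on all subsets beats training on the same subset alone, combining subsets is beneficial. Those are the two questions from the abstract.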