Models may overfit when there are too few data points relative to the number of variables. We should look for good rules of thumb and issue a warning when this is the case.
Some models, in particular Gaussian processes (GPs), become very slow and compute-intensive with too many samples, since exact GP fitting scales roughly cubically with the number of samples. We should issue a warning when the dataset is too big. Even better, we could provide a drop-in replacement for the scikit-learn GP that handles larger datasets (probably at the cost of accuracy). This is an active area of research, but there might be existing solutions.
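A minimal sketch of what the two warnings could look like, using only the standard library. The function name `check_data_size`, the `is_gp` flag, and both thresholds (10 samples per variable, 5000 samples for a GP) are assumptions made for illustration; they are not agreed rules of thumb and would need to be tuned per model.

```python
import warnings

# Placeholder thresholds, assumed for illustration only.
MIN_SAMPLES_PER_FEATURE = 10  # a common rule of thumb, not a guarantee
MAX_GP_SAMPLES = 5000         # exact GP fitting cost grows roughly cubically


def check_data_size(n_samples, n_features, is_gp=False, *,
                    min_samples_per_feature=MIN_SAMPLES_PER_FEATURE,
                    max_gp_samples=MAX_GP_SAMPLES):
    """Emit a UserWarning for each size problem and return the messages."""
    if n_samples < 1 or n_features < 1:
        raise ValueError("n_samples and n_features must be positive")

    messages = []

    # Too few data points relative to the number of variables.
    if n_samples < min_samples_per_feature * n_features:
        messages.append(
            f"Only {n_samples} samples for {n_features} variables "
            f"(fewer than {min_samples_per_feature} per variable); "
            "the model may overfit."
        )

    # Too many samples for a Gaussian process.
    if is_gp and n_samples > max_gp_samples:
        messages.append(
            f"{n_samples} samples exceed the suggested GP limit of "
            f"{max_gp_samples}; fitting may be very slow."
        )

    for message in messages:
        warnings.warn(message, UserWarning, stacklevel=2)
    return messages
```

For a feature matrix `X` with rows as samples, this could be called as `check_data_size(X.shape[0], X.shape[1], is_gp=True)` before fitting. Returning the messages as well as emitting them keeps the check easy to test.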
Thanks to @MaxBalmus for the feedback.