Hi :)
We actually looked into this a while ago but were not very impressed with the results. That said, we did not spend a lot of time tuning it. There is this package that builds on ours and tries to make this work, but we're not sure how well it works yet: https://github.com/ersilia-os/ensemble-tabpfn
@SamuelGabriel Thanks for the information. Btw, TabPFN is also now integrated with AutoGluon: https://github.com/autogluon/autogluon/pull/3270
Hi, thanks for this very interesting and unorthodox approach. I wonder if you have tried to scale it up with random patches: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingClassifier.html
Basically: TabPFN as the base estimator, max_samples=1000, max_features=100, and n_estimators tuned with e.g. early stopping (see the sketch below). Note that n_jobs needs to be 1, or Google Colab crashes. I know it is a naive and mundane approach, but it could be good enough/competitive in some cases. Btw, even with data that fits within the constraints, subsampling features could help a little with uninformative ones.
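A minimal sketch of what I mean, assuming the `TabPFNClassifier` from the `tabpfn` package and scikit-learn >= 1.2 (older versions use `base_estimator=` instead of `estimator=`); the dataset and parameter values are illustrative, not tuned:

```python
# Random-patches ensembling: BaggingClassifier subsamples both rows and features,
# so each TabPFN fit stays within its sample/feature limits.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

# Synthetic data just for illustration: more rows/features than TabPFN handles directly.
X, y = make_classification(n_samples=5000, n_features=150, n_informative=30,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ensemble = BaggingClassifier(
    estimator=TabPFNClassifier(),  # TabPFN as the base estimator
    max_samples=1000,              # rows per patch, within TabPFN's sample limit
    max_features=100,              # features per patch, within TabPFN's feature limit
    bootstrap=False,               # draw rows without replacement
    bootstrap_features=False,      # draw features without replacement
    n_estimators=16,               # could be tuned, e.g. with early stopping
    n_jobs=1,                      # >1 reportedly crashes on Google Colab
    random_state=0,
)
ensemble.fit(X_train, y_train)
print("test accuracy:", ensemble.score(X_test, y_test))
```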