Closed: Innixma closed this issue 2 years ago
Note that the current pip install TabPFN installs the following:
Successfully installed configspace-0.6.0 gpytorch-1.8.1 hyperopt-0.2.7 liac-arff-2.5.0 minio-7.1.12 openml-0.12.2 py4j-0.10.9.7 tabpfn-0.1.5 xmltodict-0.13.0
I don't think configspace, openml, or hyperopt should be necessary (at a minimum). configspace is not supported on Windows, and if I were to seriously try integrating this into a system like AutoGluon, TabPFN would need to have only the minimum dependencies required to function to avoid bloat.
Hi there :)
setup.py was very bloated indeed. We switched to a simplified setup.py that uses pyproject.toml. So now pip install -e . installs the same things as pip install tabpfn. Thanks for pointing this out.
See https://github.com/automl/TabPFN/blob/main/setup.py
Reducing the dependencies further would make sense, too, we believe. Right now the dependencies are chosen such that you can re-train your own TabPFN, but probably most users need less.
You could use the extras_require functionality of setup.py so that pip install TabPFN installs only the minimal inference dependencies and pip install TabPFN[train] includes the full dependencies.
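A minimal sketch of the suggestion above, assuming a setuptools-based setup.py (setuptools exposes optional dependency groups through its extras_require argument). The split between inference and training packages here is purely illustrative and is not TabPFN's actual packaging; the helper name build_setup_kwargs is hypothetical.

```python
# Hypothetical split: which packages belong to inference vs. training is an
# assumption made for illustration, not the project's real dependency list.
INFERENCE_REQUIRES = [
    "torch>=1.9.0",
    "scikit-learn>=0.24.2",
    "numpy>=1.21.2",
]

# Packages that would only be pulled in by `pip install tabpfn[train]`.
TRAIN_EXTRAS = [
    "configspace",
    "hyperopt",
    "openml",
]


def build_setup_kwargs():
    """Build the keyword arguments a setup.py would pass to setuptools.setup()."""
    return {
        "name": "tabpfn",
        # Always installed by a plain `pip install tabpfn`.
        "install_requires": INFERENCE_REQUIRES,
        # Installed only when the named extra is requested.
        "extras_require": {"train": TRAIN_EXTRAS},
    }


# In a real setup.py this would end with:
#     from setuptools import setup
#     setup(**build_setup_kwargs())
```

With a layout like this, the heavier packages stay out of a default install, and users who want to re-train opt in explicitly through the extra.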
I am mostly interested in using the model purely for inference, not for training.
Sounds good! Most people are, I guess. I will let you know when we have progress on separating concerns in our dependencies.
I was able to remove most non-standard dependencies without much restructuring and have updated the pip package accordingly. We are now down to the following dependencies:
dependencies=[
    'gpytorch>=1.5.0',
    'torch>=1.9.0',
    'scikit-learn>=0.24.2',
    'pyyaml>=5.4.1',
    'numpy>=1.21.2',
    'requests>=2.23.0',
]
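To check what an installed release actually declares, one option is the standard-library importlib.metadata module (Python 3.8+). The sketch below is an assumption-laden helper, not part of TabPFN: the function names are hypothetical, and the check for extras is a simple string heuristic on the environment marker rather than a full requirement parser.

```python
from importlib import metadata


def declared_requirements(dist_name):
    """Return the requirement strings an installed distribution declares,
    or None if the distribution is not installed."""
    try:
        reqs = metadata.requires(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # metadata.requires() returns None when nothing is declared.
    return list(reqs or [])


def split_by_extra(requirements):
    """Separate always-installed requirements from those gated behind an extra."""
    base, optional = [], []
    for req in requirements:
        # Extras-only requirements carry a marker such as: ; extra == "train"
        _, _, marker = req.partition(";")
        if "extra==" in marker.replace(" ", ""):
            optional.append(req)
        else:
            base.append(req)
    return base, optional
```

After installing the package, calling split_by_extra(declared_requirements("tabpfn")) would show which requirements every user pays for and which are opt-in.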
gpytorch could also be removed in a second step with somewhat deeper changes, but I am not sure that is crucial. What do you think?
That is awesome! If gpytorch could be removed, that would be very helpful, as gpytorch further depends on linear_operator>=0.1.1, which is a beta project with 17 GitHub stars that doesn't support Python 3.7. This is not workable for AG, as we wish to continue supporting Python 3.7.
Once gpytorch is removed, the only added dependency would be TabPFN itself, which is ideal.
We fixed this, too, now :) The gpytorch dependency is gone. I will upload this to pip in the coming days (it is already on main, though). Thanks to @David-Schnurr
Awesome!!!!! Please let me know when the pip release is available, I'd love to try it out
We pushed this to pip. Feel free to give it a try :)
Nice!! Will do
setup.py seems to install more dependencies than should be necessary for this model to function. Would it make sense to instead have a tabpfn[benchmark] extra dependencies option akin to tabpfn[baselines]? Currently it is unclear how to clone TabPFN and run it from a source install without these dependencies.
Perhaps it would be ideal to instead have an entirely separate repository for benchmarking TabPFN so that CatBoost etc. have nothing to do with this repo. This would help a lot in terms of code cleanliness and separation of concerns.