treforevans / uci_datasets

Regression datasets from the UCI repository with standardized test-train splits.
MIT License

Allow dependence on `uci_datasets` via `pip` #2

Closed eringrant closed 2 years ago

eringrant commented 2 years ago

Thanks for creating this package; it's super useful for working with UCI datasets!

I'd like to be able to include this package as a dependency in other projects' setup.py or setup.cfg files via:

uci_datasets @ git+https://github.com/treforevans/uci_datasets
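For illustration, a downstream project's setup.py might carry this direct reference roughly as follows. This is only a sketch: the project name `my_project` and its version are hypothetical, and the `setup()` call is left in a comment so the snippet runs without triggering a build.

```python
# Sketch of a downstream setup.py; `my_project` is a hypothetical name.
# The requirement uses the "<name> @ <url>" direct-reference form,
# which pip can install straight from a Git repository.
INSTALL_REQUIRES = [
    "uci_datasets @ git+https://github.com/treforevans/uci_datasets",
]

SETUP_KWARGS = dict(
    name="my_project",
    version="0.1.0",
    install_requires=INSTALL_REQUIRES,
)

# In a real setup.py this would be followed by:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```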

Including the data files as package data in your setup.py, as in this PR, enables that, since you are no longer using Git LFS.
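As a rough sketch of what shipping the data as package data could look like: the helper `find_data_files` and the `*.csv.gz` pattern below are assumptions for illustration, and the actual file layout in the repository and the contents of the PR may differ.

```python
from pathlib import Path


def find_data_files(package_dir, pattern="*.csv.gz"):
    """Return data file paths relative to the package directory.

    The default pattern is a hypothetical choice, not the repo's layout.
    """
    root = Path(package_dir)
    # as_posix() yields forward slashes, which setuptools expects
    # in package_data entries on every platform.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob(pattern))


# In setup.py, the result could be passed to setuptools, e.g.:
#     from setuptools import setup
#     setup(
#         name="uci_datasets",
#         packages=["uci_datasets"],
#         package_data={"uci_datasets": find_data_files("uci_datasets")},
#     )
```

Listing the files explicitly keeps the sketch independent of which glob features a given setuptools version supports inside `package_data`.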

When zipped, the package ends up being 327.2 MB, but I think this is fine for something that is clearly a dataset package and is only hosted on GitHub. To avoid shipping the data files in the package itself, you could instead download them on first use, as PyTorch datasets do.
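The download-on-first-use alternative could be sketched roughly as below. Everything named here is hypothetical: `ensure_dataset`, the `base_url` parameter, the `~/.cache/uci_datasets` location, and the `.csv.gz` extension are assumptions for illustration, not this package's API and not how PyTorch implements it.

```python
import urllib.request
from pathlib import Path


def ensure_dataset(name, base_url, cache_dir=None,
                   fetch=urllib.request.urlretrieve):
    """Return the local path of a dataset file, downloading it if missing.

    `fetch(url, filename)` is injectable so the logic can be exercised
    without network access; it defaults to urllib's urlretrieve.
    """
    cache = Path(cache_dir) if cache_dir else Path.home() / ".cache" / "uci_datasets"
    cache.mkdir(parents=True, exist_ok=True)
    target = cache / f"{name}.csv.gz"  # hypothetical file naming
    if not target.exists():
        # Download under a temporary name first so an interrupted
        # transfer never leaves a partial file that looks cached.
        partial = target.with_name(target.name + ".part")
        fetch(f"{base_url}/{name}.csv.gz", str(partial))
        partial.replace(target)
    return target
```

With this shape, only the first call for a given dataset touches the network; later calls return the cached path, which keeps the installed package small at the cost of requiring connectivity on first use.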