EpistasisLab / pmlb

PMLB: A large, curated repository of benchmark datasets for evaluating supervised machine learning algorithms.
https://epistasislab.github.io/pmlb/
MIT License

[Errno 2] No such file or directory #114

Closed taleslimaf closed 4 years ago

taleslimaf commented 4 years ago

I'm trying to use the fetch_data function with the local_cache_dir parameter so that the dataset is saved locally and doesn't have to be downloaded again. However, every time I call the function with a cache path, the following error appears: [Errno 2] No such file or directory.

At line 86 of pmlb.py, if the path does not exist, the file cannot be created on Windows.

My solution: Path(os.path.dirname(dataset_path)).mkdir(parents=True, exist_ok=True)
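As a rough illustration of the fix proposed above, the sketch below creates the missing parent directories before writing the cached file. The helper name save_to_cache and its payload argument are hypothetical and are not part of pmlb; only the mkdir call is taken from this report.

```python
import os
from pathlib import Path


def save_to_cache(dataset_path: str, payload: bytes) -> str:
    """Write payload to dataset_path, creating missing parent directories first.

    Hypothetical helper for illustration only; it is not pmlb's own code.
    """
    # Same call as the proposed fix: create every missing parent directory
    # and do not raise if the directory already exists.
    Path(os.path.dirname(dataset_path)).mkdir(parents=True, exist_ok=True)

    # Opening a file for writing only fails with [Errno 2] when the parent
    # directory is missing, which the line above now rules out.
    with open(dataset_path, "wb") as handle:
        handle.write(payload)
    return dataset_path
```

Because exist_ok=True is set, calling the helper again with the same path simply overwrites the file instead of failing on the existing directory.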

Besides that, some dataset names listed in classification_dataset_names are not actually among the available datasets. For example: "cars1"

cfusting commented 4 years ago

It looks like this has been fixed in master but not in pip package 1.0.

trangdata commented 4 years ago

Thank you both -- @taleslimaf for reporting and @cfusting for contributing!

> It looks like this has been fixed in master but not in pip package 1.0.

I believe this is correct. This commit should have fixed it. We'll update the pip package this coming week. CC @weixuanfu

> Besides that, some dataset names listed in classification_dataset_names are not actually among the available datasets. For example: "cars1"

We have removed the hardcoded dataset names here, so in the next minor release, classification_dataset_names will only contain the datasets that exist in the repo. (cars1 was removed because it was a duplicate of cars; see https://github.com/EpistasisLab/penn-ml-benchmarks/issues/85.)
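One way to keep a name list in sync with the data is to derive it from what is actually on disk. The sketch below is only an assumption about how that could look, not the implementation used in pmlb: it assumes a hypothetical layout in which each dataset lives at `<root>/<name>/<name>.tsv.gz`, and both function names are made up for illustration.

```python
from pathlib import Path


def list_available_datasets(root):
    """Return the sorted names of datasets that have a data file under root.

    Assumes the hypothetical layout <root>/<name>/<name>.tsv.gz.
    """
    root_path = Path(root)
    return sorted(
        entry.name
        for entry in root_path.iterdir()
        if entry.is_dir() and (entry / f"{entry.name}.tsv.gz").is_file()
    )


def missing_names(declared, root):
    """Return the declared names that have no matching dataset under root."""
    available = set(list_available_datasets(root))
    return [name for name in declared if name not in available]
```

With a check like missing_names, a stale entry such as "cars1" would be reported as soon as its data file is removed, rather than surfacing later as a failed download.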

trangdata commented 4 years ago

I believe this has now been fixed. Please don't hesitate to reopen if the problem persists.