NSAPH-Projects / space

SpaCE, the Spatial Confounding Environment, loads benchmark datasets for causal inference methods tackling spatial confounding
https://nsaph-projects.github.io/space/
MIT License

No such file or directory error #146

Closed: zcalhoun closed this issue 7 months ago

zcalhoun commented 7 months ago

First off - thank you for putting together this repository. I am looking forward to seeing this project succeed.

I have a question about loading the data. When I tried to follow your instructions under "Getting Started," I got the error "No such file or directory [...] /healthd_dmgrcs_mortality_disc/synthetic_data.csv." I looked at what had been loaded and, sure enough, the data was there, but the synthetic data appears to be a .tab file rather than a .csv file. This seems like a bug, and I was wondering whether I can still trust the generated data.

mauriciogtec commented 7 months ago

We are looking into it. Thanks for raising an issue! For background, are you using the GitHub code or the PyPI package?

zcalhoun commented 7 months ago

I tried both the PyPI package and the GitHub code.

mauriciogtec commented 7 months ago

@zcalhoun The PyPI version has the error and needs to be updated, but we don't see the error when installing directly from the repository:

pip install "git+https://github.com/NSAPH-Projects/space@dev#egg=spacebench[all]"

We will update the PyPI version. In the meantime, can you uninstall your current version (pip uninstall spacebench) and install from GitHub as above? If the problem persists, please let us know and include a code sample and information about your system so we can reproduce the error.
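One way to gather the system information requested above is sketched below. It uses only the Python standard library; the helper name environment_report is hypothetical and not part of spacebench, and the package name is taken from the pip commands in this thread.

```python
import platform
import sys
from importlib import metadata


def environment_report(package: str = "spacebench") -> str:
    """Summarize the installed package version, Python version, and OS.

    Hypothetical helper for bug reports; not part of spacebench itself.
    """
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the current environment.
        version = "not installed"
    lines = [
        f"{package}: {version}",
        f"python: {sys.version.split()[0]}",
        f"platform: {platform.platform()}",
    ]
    return "\n".join(lines)


print(environment_report())
```

Pasting the printed lines into the issue, together with the exact call that fails, should usually be enough to attempt a reproduction.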

mauriciogtec commented 7 months ago

A new version, v0.1.4, has been released on PyPI.

zcalhoun commented 7 months ago

Hi Mauricio,

Thank you for following up. I am still seeing the issue when I try to load the datasets. I think the issue is with this line of code: https://github.com/NSAPH-Projects/space/blob/ad1630038abee4fdab3080f25fba30bd4a4df85c/spacebench/env.py#L296

Basically, the 1: slice should be replaced with -1, so that you always get the file type from the last dot-separated piece of the path. The error stems from the fact that I was using a relative path beginning with ./, and the leading dot in that path doesn't work with the way this line of code is written. I can make the change and open a pull request if you'd like.
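The failure mode described here can be illustrated with a minimal sketch. The two functions below are a hypothetical reconstruction of the slicing logic under discussion, not the actual contents of spacebench/env.py, and the paths are made up for illustration.

```python
def ext_after_first_dot(path: str) -> str:
    # Assumed shape of the original logic: keep everything after the FIRST dot.
    return ".".join(path.split(".")[1:])


def ext_after_last_dot(path: str) -> str:
    # Proposed change: keep only the piece after the LAST dot.
    return path.split(".")[-1]


plain_path = "data/synthetic_data.tab"       # hypothetical path
relative_path = "./data/synthetic_data.tab"  # same file, with a ./ prefix

print(ext_after_first_dot(plain_path))     # tab
print(ext_after_first_dot(relative_path))  # /data/synthetic_data.tab
print(ext_after_last_dot(relative_path))   # tab
```

With the ./ prefix, the first dot belongs to the directory part of the path, so the first function returns most of the path instead of an extension. Any later comparison against "csv" or "tab" then fails.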

mauriciogtec commented 7 months ago

> I can make the change and issue a pull request if you'd like.

@zcalhoun That would be fantastic! We have to be careful with the case where the extension is .tar.gz, in which case [-1] alone wouldn't work. Maybe we can use .endswith() and define ext inside the if-else statement?
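The .endswith() idea could look roughly like the sketch below. It is an illustration under stated assumptions, not the code merged into the repository: the function name detect_extension and the paths are hypothetical, and the extension list is taken from the file types mentioned in this thread.

```python
# Extensions mentioned in this thread. str.endswith matches the whole
# suffix, so ".tar.gz" is recognized as one multi-part extension.
KNOWN_EXTENSIONS = (".tar.gz", ".graphml", ".parquet", ".csv", ".tab")


def detect_extension(path: str) -> str:
    """Return the known extension of `path`, without the leading dot."""
    for ext in KNOWN_EXTENSIONS:
        if path.endswith(ext):
            return ext.lstrip(".")
    raise ValueError(f"Unsupported file type: {path}")


print("./data/graph.tar.gz".split(".")[-1])           # gz
print(detect_extension("./data/graph.tar.gz"))        # tar.gz
print(detect_extension("./data/synthetic_data.tab"))  # tab
```

Matching on the end of the string sidesteps both problems at once: a leading ./ is never inspected, and multi-part extensions stay intact.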

zcalhoun commented 7 months ago

I see what you mean in the code -- there are two separate lines of code that use similar logic to load the .csv/.tab/.parquet files, and then a separate set of code to open up graphml/tar.gz files. I made updates to both that should address the issue I encountered (see PR #151).