weecology / retriever

Quickly download, clean up, and install public datasets into a database management system
http://data-retriever.org

Added tidycensus dataset #1606

Closed Aakash3101 closed 3 years ago

Aakash3101 commented 3 years ago

Python script for installing the datasets present in the R package tidycensus using rpy2.

Other changes I made after yesterday's meeting: the script now checks whether the raw data files are already present. If they are, it uses those files and installs them.
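The check for existing raw data files could look roughly like the sketch below. It is only an illustration: `ensure_raw_data`, `raw_dir`, and the `download` callback are hypothetical names, not part of retriever or of this PR, and "present" is assumed to mean that the directory contains at least one regular file.

```python
from pathlib import Path
from typing import Callable


def ensure_raw_data(raw_dir: Path, download: Callable[[Path], None]) -> bool:
    """Return True if existing raw files were reused, False if a download ran.

    `raw_dir` and `download` are hypothetical stand-ins, not retriever APIs.
    """
    # Assumption: the raw data counts as present if the directory
    # exists and holds at least one regular file.
    if raw_dir.is_dir() and any(p.is_file() for p in raw_dir.iterdir()):
        return True

    # Nothing usable on disk: create the directory and fetch the data.
    raw_dir.mkdir(parents=True, exist_ok=True)
    download(raw_dir)
    return False
```

With this shape, the install step can run against `raw_dir` in both cases, and the download callback is only invoked when nothing is on disk.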

Aakash3101 commented 3 years ago

Okay, so the tests are failing because when the scripts are read, the import statements are evaluated and the rpy2 package cannot be imported. I have tested that only the import lines are read; the script itself is not executed. So we can put the import statements in a try-except block like:

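try:
    # rpy2 is only needed when the script actually runs; the tests
    # only read the imports, so a missing rpy2 should not break loading.
    import rpy2.robjects as ro
    import rpy2.robjects.packages as rpackages
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter
except ImportError:
    pass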
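
One drawback of a bare `pass` is that a missing rpy2 only shows up later as a `NameError` when the script runs. A possible variant, sketched here with hypothetical helper names (`optional_import` and `require` are not part of retriever or rpy2), defers the failure to the point of use and gives a clearer message:

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def require(name, purpose):
    """Import `name`, or raise a readable error explaining why it is needed."""
    module = optional_import(name)
    if module is None:
        raise RuntimeError(f"{name} is required to {purpose}")
    return module
```

Calling something like `require("rpy2.robjects", "install tidycensus")` inside the install function would keep the script importable by the tests while still failing loudly at install time if rpy2 is absent.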

henrykironde commented 3 years ago

@Aakash3101, let's work on the cleanup together; I have some new designs we can follow.

Aakash3101 commented 3 years ago

Sure

Aakash3101 commented 3 years ago

Installing tidycensus without internet:

Screenshot from 2021-08-10 23-55-46


Installing tidycensus with internet:

Screenshot from 2021-08-10 23-58-54

Aakash3101 commented 3 years ago

@henrykironde could you share your changes to the install_tidycensus function?

Aakash3101 commented 3 years ago

@henrykironde I have not been able to make the script honor the retriever download tidycensus --path <PATH> command. The raw data is still saved in the ~/.retriever/raw_data/tidycensus directory and not at the specified PATH.

I think @kkothari2001 could help me here, and it would also help him complete his own task.
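
For reference, the behaviour I am after for --path is roughly the following. This is a minimal sketch and not how retriever resolves the option internally: `resolve_raw_dir` and `DEFAULT_RAW_DIR` are hypothetical names, and the only detail taken from above is the default directory.

```python
from pathlib import Path
from typing import Optional

# Default location mentioned above: ~/.retriever/raw_data/tidycensus
DEFAULT_RAW_DIR = Path.home() / ".retriever" / "raw_data" / "tidycensus"


def resolve_raw_dir(path: Optional[str] = None) -> Path:
    """Use the user-supplied PATH when given, else the default directory."""
    if path:
        # Expand "~" and make the path absolute so later steps
        # do not depend on the current working directory.
        return Path(path).expanduser().resolve()
    return DEFAULT_RAW_DIR
```

The script would then write the raw files to `resolve_raw_dir(path)` instead of always using the default directory.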