edgi-govdata-archiving / ECHO-by-Zip-Code

A Jupyter Notebook-based exploration of emissions permits, compliance, and enforcement designed for localization by zip code
https://colab.research.google.com/github/edgi-govdata-archiving/ECHO-by-Zip-Code/blob/main/echo-by-zip.ipynb
GNU General Public License v3.0

Where to keep the data? #2

Closed: Frijol closed this issue 4 years ago

Frijol commented 4 years ago

Currently, the repo is set up to run locally & pull from a data folder that is not checked in (as part of the .gitignore configuration).

For https://github.com/edgi-govdata-archiving/ECHO-by-Zip-Code/issues/1, the smoothest run for report generation would involve hosting the CSV file on the internet. The file is downloadable as a ZIP from ECHO, but it's too big (nearly 1.5 GB) to check in to the repo.

This blog post outlines three ways to load CSV files into Colab. The simplest is to link to a URL for a raw CSV. Does this make sense for us to do, perhaps via @qri-io (I want to archive the data there anyway)?
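For reference, here is a minimal sketch of what the "raw CSV URL" option could look like in the notebook, using only the Python standard library. It streams the file row by row instead of loading all ~1.5 GB into memory. The column names `FAC_NAME` and `FAC_ZIP`, the helper names `rows_for_zip` and `open_remote_csv`, and the sample rows are assumptions for illustration, not confirmed details of the ECHO export or of this repo.

```python
import csv
import io
import urllib.request


def rows_for_zip(lines, zip_code, zip_column="FAC_ZIP"):
    """Yield CSV rows (as dicts) whose zip column starts with zip_code.

    `lines` is any iterable of text lines, so a remote stream and an
    in-memory sample work the same way. The default column name is an
    assumption; adjust it to match the real file's header.
    """
    reader = csv.DictReader(lines)
    for row in reader:
        # Prefix match so ZIP+4 values like "02139-1234" are included.
        if (row.get(zip_column) or "").strip().startswith(zip_code):
            yield row


def open_remote_csv(url):
    """Open a raw CSV URL as a text stream (hypothetical helper).

    The response is wrapped rather than read in full, so rows are
    decoded lazily as the caller iterates.
    """
    response = urllib.request.urlopen(url)
    return io.TextIOWrapper(response, encoding="utf-8", newline="")


# Small in-memory sample standing in for the real file (made-up rows).
sample = io.StringIO(
    "FAC_NAME,FAC_ZIP\n"
    "Plant A,02139\n"
    "Plant B,98101\n"
    "Plant C,02139-1234\n"
)
matches = list(rows_for_zip(sample, "02139"))
print([row["FAC_NAME"] for row in matches])
```

With a hosted file, the same filter would be called as `rows_for_zip(open_remote_csv(url), "02139")`, where `url` points at wherever the raw CSV ends up living. Streaming keeps memory use flat, at the cost of re-downloading the file on every run.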

Frijol commented 4 years ago

Going to try git-lfs on a separate repo

Frijol commented 4 years ago

The data is now stored in https://github.com/edgi-govdata-archiving/echo-data ! The upload took so long I wasn't sure it was going to work, but now it's online & I'm testing it in the notebook.