RepoData / RepoData

Public-facing data for the US Archives RepoData project
Remove openpyxl dependency #26

Open edsu opened 2 years ago

edsu commented 2 years ago

Now that we are treating data.csv as the primary data source and generating from it, we aren't using Excel anymore. This means we can remove the openpyxl dependency and update bin/check.py and bin/geocode.py to work off the CSV. The instructions in README.md will also need to be adjusted so they no longer mention the need to pip install it.
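For illustration, here is a rough sketch of what reading the CSV with only the standard library could look like in place of openpyxl. The helper names (`read_rows`, `missing_columns`) and the sample columns (`name`, `state`, `zip`) are hypothetical and are not taken from bin/check.py or bin/geocode.py; the real column names in data.csv may differ.

```python
import csv
import io


def read_rows(stream):
    """Return the rows of a CSV stream as a list of dicts keyed by header."""
    return list(csv.DictReader(stream))


def missing_columns(rows, required):
    """Return the required column names absent from the first row, in order."""
    if not rows:
        return list(required)
    return [name for name in required if name not in rows[0]]


# Hypothetical sample data; stands in for the contents of data.csv.
sample = io.StringIO("name,state\nExample Archive,NY\n")
rows = read_rows(sample)
```

Against the real file, the same helper would presumably be called as `with open("data.csv", newline="", encoding="utf-8") as f: rows = read_rows(f)`, assuming the file is UTF-8 encoded.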

helrond commented 2 years ago

bin/geocode.py is a little confusing: the docstrings talk about a source file in Excel format, but the path points to a CSV (and my email records indicate that we did in fact use a CSV rather than XLSX). We'll need to double-check with @tanseyem before we move on this.

helrond commented 2 years ago

I'm also wondering if bin/check.py is obsolete. Its explicit purpose was to get all the state spreadsheets aligned so we could merge them into a single CSV, which we've now done.