aiddata / gcdf-geospatial-data

Repository for AidData's Geospatial Global Chinese Development Finance Dataset (GeoGCDF)
https://aiddata.org/china

Can we add source data to repo? #9

Closed sgoodm closed 3 years ago

sgoodm commented 3 years ago

Currently, the raw source data is not included in the repo. Ideally we would add it so that the build is fully replicable, but we need to make sure there are no issues with releasing the raw data to the public.

If there are issues with sharing the full source data, I think we should still include a limited version with the key columns (i.e., TUFF ID, the text field with OSM links, and anything else used by the build).
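
A limited release along these lines could be produced by copying only the columns the build needs. The following is a minimal sketch, not the repository's code: the column names `tuff_id` and `osm_links` and the function `write_limited_csv` are hypothetical placeholders, and it assumes the main sheet has already been exported to CSV.

```python
import csv

# Hypothetical column names; the real export's headers may differ.
KEY_COLUMNS = ["tuff_id", "osm_links"]


def write_limited_csv(src, dst, columns=KEY_COLUMNS):
    """Copy only `columns` from the CSV stream `src` to the CSV stream `dst`."""
    reader = csv.DictReader(src)
    # Fail early if the source lacks a column the build depends on.
    missing = [c for c in columns if c not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"source is missing required columns: {missing}")
    # extrasaction="ignore" drops every column not listed in `columns`.
    writer = csv.DictWriter(dst, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
```

Failing on a missing column, rather than silently writing blanks, keeps the limited file from drifting out of sync with what the build expects.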

sgoodm commented 3 years ago

Current discussions suggest we will add the full source dataset, which will also serve as the main download point for the data.

Since the original export has been an Excel document (.xlsx) with multiple sheets, we may end up either uploading both the Excel document and a CSV of the main sheet containing project data, or updating the code to read a specific sheet from the Excel file.
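
Both options could be covered by one small helper. The sketch below assumes pandas with an Excel engine such as openpyxl is installed; the sheet name `projects` and the function names are hypothetical and would need to match the actual workbook.

```python
import pandas as pd

# Hypothetical sheet name; the real workbook's sheet names may differ.
MAIN_SHEET = "projects"


def load_project_sheet(xlsx_path, sheet_name=MAIN_SHEET):
    """Read one named sheet from a multi-sheet workbook into a DataFrame."""
    # Reading .xlsx requires an Excel engine (e.g. openpyxl) to be installed.
    return pd.read_excel(xlsx_path, sheet_name=sheet_name)


def export_sheet_to_csv(xlsx_path, csv_path, sheet_name=MAIN_SHEET):
    """Write the chosen sheet out as a companion CSV file."""
    load_project_sheet(xlsx_path, sheet_name).to_csv(csv_path, index=False)
```

Reading the sheet by name rather than by position means the build does not break if sheets are reordered in a later export.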

sgoodm commented 3 years ago

Input data is now included in the repo as xlsx sheets, which the processing code reads directly.