aiddata / gcdf-geospatial-data

Repository for AidData's Geospatial Global Chinese Development Finance Dataset (GeoGCDF)
https://aiddata.org/china

Finalize data structure #6

Closed sgoodm closed 3 years ago

sgoodm commented 3 years ago

Finalize how the repo and data are structured.

Currently the latest directory holds a geojson subdirectory for all individual project GeoJSONs (labeled based on TUFF ID, with a single GeoJSON per project), and a combined.geojson.zip which contains all project features in a single GeoJSON.
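For illustration, a combined archive like the one described above could be built along these lines. This is a minimal sketch, not the repository's actual build script: the function name `combine_geojsons`, the `*.geojson` file pattern, and the member name inside the zip are assumptions.

```python
import json
import zipfile
from pathlib import Path


def combine_geojsons(geojson_dir, zip_path, member_name="combined.geojson"):
    """Merge every per-project GeoJSON in geojson_dir into one zipped FeatureCollection.

    Returns the number of features written.
    """
    features = []
    for path in sorted(Path(geojson_dir).glob("*.geojson")):
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        # A project file may be a FeatureCollection or a bare Feature.
        if data.get("type") == "FeatureCollection":
            features.extend(data.get("features", []))
        elif data.get("type") == "Feature":
            features.append(data)

    combined = {"type": "FeatureCollection", "features": features}
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(member_name, json.dumps(combined))
    return len(features)
```

Calling `combine_geojsons("latest/geojson", "latest/combined.geojson.zip")` would write the archive, assuming those paths match the layout described above.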

The major issue is that only 1,000 files are viewable within a single directory on GitHub. A possible solution is to split the geojsons directory into subdirectories based on geographic region, country, or some other variable contained in the raw data.

Depending on how GeoJSONs are split/grouped, this may be a temporary solution as new data is added.
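A rough sketch of how such a split could be planned and checked against the 1,000-file limit. The input mapping of project ID to grouping value (for example a country) and the `<ID>.geojson` file naming are assumptions for illustration; the function only plans the layout and moves no files.

```python
from collections import defaultdict

GITHUB_DIR_LIMIT = 1000  # files viewable per directory, per the issue above


def plan_subdirectories(project_groups, limit=GITHUB_DIR_LIMIT):
    """Map each group (e.g. a country) to the GeoJSON filenames it would hold.

    project_groups: dict of project ID -> grouping value (hypothetical input,
    e.g. built from a country column in the raw data).
    Returns (layout, oversized), where oversized lists the groups that would
    still exceed the per-directory limit.
    """
    layout = defaultdict(list)
    for project_id, group in sorted(project_groups.items()):
        # Assumed naming: one file per project, named after its ID.
        layout[str(group)].append(f"{project_id}.geojson")
    oversized = sorted(g for g, files in layout.items() if len(files) > limit)
    return dict(layout), oversized
```

Any group returned in `oversized` would need a finer split, which is one way the grouping could turn out to be temporary as new data is added.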

Notes:

sgoodm commented 3 years ago

Also consider renaming combined.geojson.zip to global.geojson.zip or all_projects.geojson.zip

sgoodm commented 3 years ago

It seems unlikely that people will browse GitHub to find GeoJSONs, since they can either click the full link in the main dataset or rebuild the link using the TUFF ID. Therefore we are going to keep the file structure simple and just use the current single directory.
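As a sketch of what rebuilding the link from a TUFF ID might look like: the URL template below is hypothetical. The branch name, the directory path, and the `<ID>.geojson` file naming are all assumptions and should be checked against the actual links in the main dataset.

```python
# Hypothetical template: the branch name, directory path, and file naming are
# assumptions for illustration, not confirmed by the repository.
URL_TEMPLATE = (
    "https://raw.githubusercontent.com/aiddata/gcdf-geospatial-data/"
    "{branch}/latest/geojson/{tuff_id}.geojson"
)


def geojson_url(tuff_id, branch="main", template=URL_TEMPLATE):
    """Build the link to a single project's GeoJSON from its TUFF ID."""
    return template.format(branch=branch, tuff_id=tuff_id)
```

Because the link depends only on the ID and a fixed prefix, a flat single directory works without any lookup of which subdirectory a project belongs to.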

Keeping this open for ongoing data structure discussions prior to launch.

sgoodm commented 3 years ago

We now have multiple GeoJSON zip files: one for all projects, plus one per finance type (the official release is breaking these up into different datasets / spreadsheet tabs).

GeoJSONs will all remain in a combined geojson folder.