elastic / ems-file-service

Elastic Map Service Data Sources

Machine readable and standardized data sources for use in Elastic Map Service.

Usage

Create a new JSON or Hjson file in the appropriate folder in sources. The source file must match the schema in schema/source_schema.json.

To validate data sources against the schema, run:

yarn test

Setting the environment variable EMS_STRICT_TEST enables an additional check that ensures all field definitions are present in all features:

EMS_STRICT_TEST=ok yarn test
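As a rough illustration of what such a strict check involves, the sketch below scans a GeoJSON FeatureCollection and reports every feature whose properties lack one of the declared fields. It is not the repository's actual test code: the function name findMissingFields, the sample data, and the XX-01 style codes are made up for this example, and only the presence of a key is checked, not its value.

```javascript
// Illustrative sketch only; the repository's real strict test may differ.
// `findMissingFields` and the sample data below are hypothetical.
function findMissingFields(featureCollection, fieldNames) {
  const problems = [];
  featureCollection.features.forEach((feature, index) => {
    const properties = feature.properties || {};
    for (const name of fieldNames) {
      // A field counts as present if the key exists on the feature's properties.
      if (!Object.prototype.hasOwnProperty.call(properties, name)) {
        problems.push({ index, field: name });
      }
    }
  });
  return problems;
}

// Made-up sample: the second feature lacks label_en.
const sample = {
  type: "FeatureCollection",
  features: [
    { type: "Feature", properties: { iso_3166_2: "XX-01", label_en: "First" }, geometry: null },
    { type: "Feature", properties: { iso_3166_2: "XX-02" }, geometry: null },
  ],
};

console.log(findMissingFields(sample, ["iso_3166_2", "label_en"]));
```

For the sample above, the only reported problem is the missing label_en on the second feature (index 1).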

To build manifests and vector data files for all versions, run:

yarn build

Continuous Integration and Deployment

Versioning

Adding a new country subdivision vector layer

Whenever possible, new vector layers should be created using a SPARQL query in Sophox.

  1. Check out the upstream feature-layers branch.
  2. If necessary, create a new folder in the sources directory with the corresponding two-letter country code (ex. ru for Russia).
  3. Copy and paste the template source file (templates/source_template.hjson) into the country directory from step 2. Give it a useful name (ex. states.hjson, provinces.hjson, etc).
  4. Complete the note and name fields in the new source file.
  5. Copy and paste the query.sparql value into the query box on http://sophox.org.
  6. Change the Q33 in the VALUES ?entity { wd:Q33 } to the corresponding Wikidata ID for the country for which you are adding subdivisions (ex. Q33 is the Wikidata ID for Finland).
  7. Run the SPARQL query and compare the iso_3166_2 results with the corresponding country's subdivision list on the ISO website looking for missing iso_3166_2 codes.
  8. The most common reason for missing iso_3166_2 codes in the query results is an incomplete "contains administrative territorial entity" property in the immediate parent region of the subdivision in Wikidata (usually, but not always, the country). You may need to add the subdivision Wikidata item to this property (ex. https://www.wikidata.org/wiki/Q33#P150).
  9. Add label_* fields for each official language of the country to the SPARQL query similar to the label_en field.
  10. Optionally, add unique subdivision code fields from other sources (ex. logianm in Ireland) to the query.
  11. Run the SPARQL query and check the map output.
  12. Optionally, click the "Simplify" link and drag the slider to reduce the number of vertices (smaller file size).
  13. Click the "Export" link on the top right of the map. Choose GeoJSON or TopoJSON as the File Format.
  14. Type rfc7946 in the "command line options" to reduce the precision of the coordinates and click "Export" to download the vector file.
  15. Rename the downloaded file with the first supported EMS version number (ex. _v1, _v2, _v6.6) and the vector type (geo for GeoJSON, topo for TopoJSON) (ex. russia_states_v1.geo.json). Copy this file to the data directory.
  16. Complete the emsFormats properties: type is either geojson or topojson, file is the filename specified above, default is true when there is only one format. Subsequent formats can be added but only one item in the array can have default: true. The other items must be default: false or omit default entirely.
  17. Copy and paste the SPARQL query from Sophox to the query.sparql field in the source file.
  18. Use the scripts/wikidata-labels.js script to list the humanReadableName languages from Wikidata (e.g. node scripts/wikidata-labels.js Q33). You should spot check these translations as some languages might lack specificity (e.g. Provins rather than Kinas provinser).
  19. We should maintain the current precedent for title casing legacyIds and English labels of the humanReadableName. This may need to be manually edited in the source (e.g. Paraguay Departments).
  20. All fields used by sources that do not follow the label_<language_code> schema must have translations in schema/fields.hjson. If necessary, use the scripts/wikidata-labels.js script to list translations and copy them to schema/fields.hjson (e.g. node scripts/wikidata-labels.js P5097).
  21. Generate the timestamp for the createdAt field with the following bash command (use gdate instead of date on Mac OSX): date -u +"%Y-%m-%dT%H:%M:%S.%6N"
  22. Generate a 17 digit number for the id field. A timestamp from the following bash command is suitable (use gdate instead of date on Mac OSX): date +%s%6N
  23. The filename field in the source file should match the name of the file you added to the data directory.
  24. Run yarn test to test for errors.
  25. Invalid or non-simple geometry errors that occur during testing can usually be fixed by running the clean-geom.js script against the GeoJSON file (e.g. node scripts/clean-geom.js data/usa_states_v1.geo.json).
  26. Run ./build.sh to build the manifest and blob files locally.
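
A few of the manual checks in the steps above can be sketched as small Node helpers: comparing query results against the expected ISO list (step 7), building the data file name (step 15), and checking the emsFormats default rule (step 16). All function names here are hypothetical and do not exist in the repository, the XX-01 style codes are placeholders, and the checks reflect only the rules as worded in this README, not the JSON schema itself.

```javascript
// Hypothetical helpers for illustration; none of these exist in the repository.

// Step 7: list the expected iso_3166_2 codes that the query results lack.
function missingIsoCodes(expectedCodes, resultCodes) {
  const found = new Set(resultCodes);
  return expectedCodes.filter((code) => !found.has(code));
}

// Step 15: build a data file name such as "russia_states_v1.geo.json".
function emsFileName(baseName, firstVersion, type) {
  const suffix = type === "geojson" ? "geo" : type === "topojson" ? "topo" : null;
  if (suffix === null) throw new Error(`Unsupported type: ${type}`);
  return `${baseName}_v${firstVersion}.${suffix}.json`;
}

// Step 16: at most one format may have default: true,
// and a lone format is expected to be the default.
function checkEmsFormats(emsFormats) {
  const problems = [];
  const defaults = emsFormats.filter((format) => format.default === true);
  if (defaults.length > 1) {
    problems.push("more than one format has default: true");
  }
  if (emsFormats.length === 1 && defaults.length !== 1) {
    problems.push("a single format should have default: true");
  }
  for (const format of emsFormats) {
    if (format.type !== "geojson" && format.type !== "topojson") {
      problems.push(`unknown type: ${format.type}`);
    }
  }
  return problems;
}

// Example usage with placeholder values.
console.log(missingIsoCodes(["XX-01", "XX-02", "XX-03"], ["XX-01", "XX-03"]));
console.log(emsFileName("russia_states", "1", "geojson"));
console.log(
  checkEmsFormats([
    { type: "geojson", file: "russia_states_v1.geo.json", default: true },
    { type: "topojson", file: "russia_states_v1.topo.json", default: true },
  ])
);
```

The last call reports a problem because two formats claim default: true. Setting the second entry to default: false, or omitting default there, clears it.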