sul-dlss / gis-robot-suite

Robots for GIS accessioning and delivery

Only convert `index_map.json` files to `.geojson` #848

Closed edsu closed 9 months ago

edsu commented 9 months ago

We currently rename every `.json` file to use the `.geojson` extension. While reviewing the filenames used in GIS items, I noticed that, apart from one typo (`indexmap.json`), the only JSON filename in use is `index_map.json`. To allow other types of JSON files to be present in GIS items, we should only assume that `index_map.json` is a GeoJSON file.
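A minimal sketch of the narrower rule, written in Python purely for illustration (it is not the robot's actual code). The constant `INDEX_MAP_FILENAME` and the function `delivery_filename` are hypothetical names introduced here:

```python
from pathlib import PurePosixPath

# Hypothetical constant: the only JSON filename assumed to contain GeoJSON.
INDEX_MAP_FILENAME = "index_map.json"


def delivery_filename(filename: str) -> str:
    """Return the name a file would be delivered under.

    Only a file named exactly index_map.json is renamed to use the
    .geojson extension; every other file, including other .json files,
    keeps its original name.
    """
    path = PurePosixPath(filename)
    if path.name == INDEX_MAP_FILENAME:
        return str(path.with_suffix(".geojson"))
    return filename
```

Under this rule the one misnamed file, `indexmap.json`, would no longer be renamed, so it would need to be corrected separately.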

| filename | count |
| --- | ---: |
| `data.zip` | 22397 |
| `data_EPSG_4326.zip` | 22397 |
| `preview.jpg` | 22397 |
| `index_map.json` | 147 |
| `Beechey_WGS.tif.xml` | 1 |
| `Beechey_WGS-iso19139.xml` | 1 |
| `Beechey_WGS-fgdc.xml` | 1 |
| `bathy20.txt` | 1 |
| `Government_Buildings.xls` | 1 |
| `hab2_90.dbf` | 1 |
| `rip2_90.dbf` | 1 |
| `51007.tif.xml` | 1 |
| `51007-iso19139.xml` | 1 |
| `51007-fgdc.xml` | 1 |
| `rip1_90.dbf` | 1 |
| `MCE_FJ2G_2017.tif.xml` | 1 |
| `MCE_FJ2G_2017-iso19139.xml` | 1 |
| `MCE_FJ2G_2017-fgdc.xml` | 1 |
| `MCE_FJ3G_2019.tif.xml` | 1 |
| `MCE_FJ3G_2019-iso19139.xml` | 1 |
| `MCE_FJ3G_2019-fgdc.xml` | 1 |
| `RR_ARMSTRNG.dbf` | 1 |
| `CadstralParcels_20230706.shp.xml` | 1 |
| `CadstralParcels_20230706-iso19139.xml` | 1 |
| `CadstralParcels_20230706-iso19110.xml` | 1 |
| `indexmap.json` | 1 |
| `hab2_42.dbf` | 1 |
| `hab3_42.dbf` | 1 |
| `hab3_90.dbf` | 1 |
| `rip3_90.dbf` | 1 |

lwrubel commented 9 months ago

@thatbudakguy advised converting all .json to .geojson, not just index_map.json. Is there any further nuance to this, @thatbudakguy?

thatbudakguy commented 9 months ago

To my knowledge, the only reason we ever ingest JSON through our GIS pipeline is as geospatial data – that is, as GeoJSON. The table above seems to bear that out. I think it's safe to rename all JSON to have the .geojson extension.

The alternatives are:

edsu commented 9 months ago

It appears that PURL only displays index_map.json files, and that EarthWorks doesn't do anything with them at the moment?

For example:

While we don't have any right now, JSON is a popular data format, and I could imagine non-GeoJSON files being accessioned in the future as part of a GIS dataset. Would we really want to assume that those are GeoJSON, and rewrite the filenames?

Absent a way of reliably identifying GeoJSON files, I do like the idea of accessioneers naming known GeoJSON files with the .geojson extension going forward.
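For what it's worth, a rough content check is possible, though it is only a heuristic and not something the robots do today. The sketch below (Python, for illustration; `looks_like_geojson` and `GEOJSON_TYPES` are hypothetical names) treats a document as GeoJSON if it parses as a JSON object whose top-level `type` member is one of the nine types defined in RFC 7946. A non-GeoJSON file that happens to have such a `type` value would be misidentified, which is why explicit naming by accessioneers still seems safer:

```python
import json

# The nine GeoJSON object types defined in RFC 7946.
GEOJSON_TYPES = frozenset({
    "Point", "MultiPoint", "LineString", "MultiLineString",
    "Polygon", "MultiPolygon", "GeometryCollection",
    "Feature", "FeatureCollection",
})


def looks_like_geojson(text: str) -> bool:
    """Heuristically decide whether a JSON document is GeoJSON.

    Returns True only when the text parses as a JSON object whose
    top-level "type" member is a string naming a GeoJSON type.
    This does not validate geometries or coordinates.
    """
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(doc, dict):
        return False
    doc_type = doc.get("type")
    return isinstance(doc_type, str) and doc_type in GEOJSON_TYPES
```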

PS. I created https://github.com/sul-dlss/sul-embed/issues/2118 to track the fact that sul-embed needs to be updated to display the GeoJSON with the new extension.

edsu commented 9 months ago

But as you say, @thatbudakguy, if we never expect to add GIS items containing other types of supporting JSON, then this can be closed.