geoblacklight / geoblacklight-schema

DEPRECATED: (See GeoBlacklight repo) A metadata schema for GIS resource discovery used by GeoBlacklight
http://github.com/geoblacklight/geoblacklight

Expand formats available in dc_format #84

Closed eliotjordan closed 7 years ago

eliotjordan commented 8 years ago

I'm working on creating geoblacklight documents from hydra/geo_concerns works and came across a limitation with the current geoblacklight schema. The dc_format field is currently limited to "Shapefile", "GeoTIFF", and "ArcGRID", but GeoConcerns supports a far larger set of formats. To fix this, I propose that we expand the list of formats:

"dc_format_s": { 
    "type": "string",
    "description": "File format for the layer, from gdal and ogr file formats.",
    "enum": ["FileGDB", "ESRI Shapefile", "PGeo", "GeoJSON", "USGSDEM", "GTiff",...]
}
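
As a rough illustration of what the proposed enum would mean during validation, here is a minimal Python sketch. It assumes only the six values named explicitly above (the elided remainder of the list is not filled in), and `check_dc_format` is a hypothetical helper written for this example, not part of GeoBlacklight or GeoConcerns.

```python
# Only the formats named explicitly in the proposal; the original list
# is elided ("...") and is deliberately not completed here.
PROPOSED_FORMATS = ["FileGDB", "ESRI Shapefile", "PGeo", "GeoJSON", "USGSDEM", "GTiff"]

# Python mirror of the proposed schema fragment for dc_format_s.
DC_FORMAT_SCHEMA = {
    "type": "string",
    "description": "File format for the layer, from gdal and ogr file formats.",
    "enum": PROPOSED_FORMATS,
}


def check_dc_format(document, schema=DC_FORMAT_SCHEMA):
    """Hypothetical helper: return error messages for the dc_format_s field."""
    errors = []
    value = document.get("dc_format_s")
    if value is None:
        # Presence is not checked in this sketch, only the value.
        return errors
    if not isinstance(value, str):
        errors.append("dc_format_s must be a string, got %r" % (value,))
    elif value not in schema["enum"]:
        errors.append("dc_format_s %r is not in the controlled list" % (value,))
    return errors


print(check_dc_format({"dc_format_s": "GTiff"}))
print(check_dc_format({"dc_format_s": "GeoTIFF"}))
```

Under this sketch the GDAL code "GTiff" passes while the label "GeoTIFF" from the current list is rejected, so existing documents would need their values migrated or mapped.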

andrewbattista commented 8 years ago

I agree, Eliot. I thought that we could just add more values to this. I know we have done this in our own local instance of GeoBlacklight. For instance, we recently added ESRI Geodatabase (https://geo.nyu.edu/?f%5Bdc_format_s%5D%5B%5D=ESRI+Geodatabase&f%5Bdct_provenance_s%5D%5B%5D=NYU). But I like the suggestion of using the GDAL and OGR file format list as a benchmark for the controlled vocabulary.

eliotjordan commented 8 years ago

That's what we're doing in GeoConcerns. Finding a comprehensive list of geospatial file formats is hard. GDAL and OGR are the most comprehensive and widely used tools for working with geodata. While not perfect or totally consistent, I'd argue that the library's list of formats is a de facto standard.

mejackreed commented 8 years ago

+1. We currently use the locales to translate to a preferred label. We will most likely need to figure out something else there to support this. https://github.com/geoblacklight/geoblacklight/blob/master/config/locales/geoblacklight.en.yml#L16-L21

krdyke commented 8 years ago

+1000. We are dealing with a ton of formats for the CIC project (ArcInfo Coverages, ASCII Grids, ERDAS img, File GDBs, KML, etc.) and I've been putting off dealing with the problem.

mejackreed commented 8 years ago

@krdyke It would be great to start a list of the formats that you need. That way we can look at starting to support these.

drh-stanford commented 8 years ago

@eliotjordan do we have the extracted GDAL/OGR file format list somewhere?

eliotjordan commented 8 years ago

@drh-stanford I don't think we do. We just referred to the GDAL manual pages. This seems like a great RDF property and value set to talk about with the Hydra URI Management Working Group.

drh-stanford commented 8 years ago

I guess the issue is whether to use their code value or human-readable one. If it's going to be a facet value the latter is probably preferable.

See http://gdal.org/formats_list.html and http://gdal.org/ogr_formats.html
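
One way to keep the code value in the document while showing a human-readable facet label is a simple lookup with a fallback. The Python sketch below is only an illustration: the `FORMAT_LABELS` table and the `facet_label` helper are hypothetical, and the labels are assumptions for the example rather than an agreed vocabulary.

```python
# Illustrative mapping from GDAL/OGR driver codes to display labels.
# These labels are assumptions for this sketch, not an agreed vocabulary.
FORMAT_LABELS = {
    "GTiff": "GeoTIFF",
    "ESRI Shapefile": "Shapefile",
    "FileGDB": "ESRI Geodatabase",
}


def facet_label(code, labels=FORMAT_LABELS):
    """Hypothetical helper: return the display label for a format code.

    Codes without an entry fall back to the code itself, so a new format
    still shows up in the facet before anyone adds a label for it.
    """
    return labels.get(code, code)


print(facet_label("GTiff"))
print(facet_label("GeoJSON"))
```

The fallback means the label table can lag behind the list of supported formats without breaking the facet.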

drh-stanford commented 8 years ago

See https://github.com/geoblacklight/geoblacklight/issues/429

drh-stanford commented 8 years ago

See https://github.com/geoblacklight/geoblacklight/pull/477. This PR removes the enum that enforced a controlled vocabulary during validation.

andrewbattista commented 7 years ago

To me, this is the same discussion as https://github.com/geoblacklight/geoblacklight/issues/429. I'm going to close this and move the discussion over there.