OpenGeoMetadata / GeoCombine

A Ruby toolkit for managing geospatial metadata
https://github.com/OpenGeoMetadata/GeoCombine

Support harvesting OGM records based on an allowed list of repositories #137

Closed (thatbudakguy closed 1 year ago)

thatbudakguy commented 1 year ago

The current behavior is to download all OGM repositories that aren't on the configured denylist: https://github.com/OpenGeoMetadata/GeoCombine/blob/7a42aaaa709985b7b9fbad70cc1b83b87ef72a5d/lib/geo_combine/harvester.rb#L12-L22

Not sure if other folks do it the same way, but at Stanford we have code to instead only harvest repositories on an allowlist. This avoids accidentally harvesting our own metadata from OGM (and possibly duplicating it) and also ensures that adding new institutional metadata is an intentional process via pull request.

It would be nice to have GeoCombine support this behavior without any additional logic.

thatbudakguy commented 1 year ago

After some thought, I don't think this needs to be core GeoCombine behavior. I was able to subclass GeoCombine::Harvester and get the behavior I wanted pretty easily, and that seems good enough.
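
For anyone landing here later, a rough sketch of the subclassing idea. This is not GeoCombine's actual code: `DenylistHarvester`, `AllowlistHarvester`, the `repositories` method, and the repository names are all hypothetical stand-ins, used only to show an allowlist layered on top of a denylist-style parent. The real `GeoCombine::Harvester` may expose different methods to override.

```ruby
# Hypothetical stand-in for a denylist-based harvester (NOT GeoCombine's class).
class DenylistHarvester
  # Assumed example of non-metadata repositories to skip.
  DENYLIST = %w[GeoCombine aardvark].freeze

  # available: list of repository names (here passed in, rather than fetched).
  def initialize(available)
    @available = available
  end

  # Default behavior: harvest everything that is not explicitly excluded.
  def repositories
    @available.reject { |name| DENYLIST.include?(name) }
  end
end

# Subclass that narrows the parent's result to explicitly approved repositories.
class AllowlistHarvester < DenylistHarvester
  # Assumed example names; in practice this would come from local configuration.
  ALLOWLIST = %w[edu.example.one edu.example.two].freeze

  def repositories
    super.select { |name| ALLOWLIST.include?(name) }
  end
end

available = %w[GeoCombine edu.example.one edu.example.two edu.example.home]
approved = AllowlistHarvester.new(available).repositories
# approved == ["edu.example.one", "edu.example.two"]
```

With this shape, a new institution's repository (or your own, like `edu.example.home` here) is never harvested until someone adds it to the allowlist, which keeps that step a deliberate change.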