Not sure if other folks do it the same way, but at Stanford we have code to instead only harvest repositories on an allowlist. This avoids accidentally harvesting our own metadata from OGM (and possibly duplicating it) and also ensures that adding new institutional metadata is an intentional process via pull request.
It would be nice to have GeoCombine support this behavior without any additional logic.
After some thought, I don't think this needs to be core GeoCombine behavior. I was able to subclass GeoCombine::Harvester and get the behavior I wanted pretty easily, and that seems good enough.
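For anyone who wants to do the same, here is a rough sketch of the subclassing idea. It is not our actual code and it does not use the real GeoCombine::Harvester: `StandInHarvester`, its `repositories` method, the constructor arguments, and the repository names are all assumptions made up so the example runs on its own. Check the method names in the real class before adapting it.

```ruby
require 'set'

# Hypothetical stand-in for GeoCombine::Harvester so this runs without the gem.
# The real class's method names and internals may differ.
class StandInHarvester
  # Hypothetical denylist entries, for illustration only.
  DENYLIST = %w[not-metadata-tools not-metadata-docs].freeze

  # available: repository names, as if already fetched from the OGM organization.
  def initialize(available)
    @available = available
  end

  # Default behavior: every repository that is not on the denylist.
  def repositories
    @available.reject { |name| DENYLIST.include?(name) }
  end
end

# Subclass that narrows the default result to an explicit allowlist.
class AllowlistHarvester < StandInHarvester
  def initialize(available, allowlist:)
    super(available)
    @allowlist = allowlist.to_set
  end

  # Keep only repositories that pass the parent filter AND are allowlisted.
  def repositories
    super.select { |name| @allowlist.include?(name) }
  end
end

# Example data (hypothetical repository names).
available = %w[edu.example.alpha edu.example.beta edu.example.ours not-metadata-tools]
harvester = AllowlistHarvester.new(
  available,
  allowlist: %w[edu.example.alpha edu.example.beta]
)
puts harvester.repositories.inspect
```

Calling `super` first means the denylist still applies, so a repository has to pass both filters. Keeping the allowlist in a file under version control is what would make adding an institution a pull request.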
The current behavior is to download all OGM repositories that aren't on the configured denylist: https://github.com/OpenGeoMetadata/GeoCombine/blob/7a42aaaa709985b7b9fbad70cc1b83b87ef72a5d/lib/geo_combine/harvester.rb#L12-L22