Esri / geoportal-server

Geoportal Server is a standards-based, open source product that enables discovery and use of geospatial resources including data and services.
https://gptogc.esri.com/geoportal
Apache License 2.0
244 stars 149 forks

OAI-PMH Harvesting #258

Open jacimize opened 7 years ago

jacimize commented 7 years ago

I am trying to register a resource on my Geoportal instance using the OAI protocol type. I have not done this before and was wondering if there are any pointers. So far I have had no success with these settings:

Host URL: http://repository.library.noaa.gov/fedora/oai?verb=ListRecords&metadataPrefix=oai_dc
Prefix: oai_dc
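For anyone who wants to test the endpoint outside Geoportal, here is a minimal sketch of how a URL like the one above breaks down into an OAI-PMH base URL plus request arguments (`verb`, `metadataPrefix`). The helper names `split_oai_url` and `build_oai_url` are made up for illustration and are not part of Geoportal Server. The sketch does not say which form the registration page expects.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def split_oai_url(url):
    """Split an OAI-PMH request URL into (base_url, arguments)."""
    parts = urlsplit(url)
    # The base URL is everything before the query string.
    base_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # parse_qs returns a list per key; keep the first value of each.
    args = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return base_url, args


def build_oai_url(base_url, verb, **args):
    """Build an OAI-PMH request URL from a base URL, a verb and arguments."""
    return base_url + "?" + urlencode({"verb": verb, **args})


url = ("http://repository.library.noaa.gov/fedora/oai"
       "?verb=ListRecords&metadataPrefix=oai_dc")
base_url, args = split_oai_url(url)
print(base_url)
print(args)
# An Identify request against the same base URL is a quick sanity check.
print(build_oai_url(base_url, "Identify"))
```

Opening the `Identify` URL in a browser should return a short XML description of the repository if the endpoint is reachable.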

jacimize commented 7 years ago

Any updates on this issue, or on how we might resolve it? We made sure to add the OAI schema, but still no luck.

mhogeweg commented 7 years ago

Hi, we suspect there is a bug in how the harvester handles the robots.txt file, most likely when parsing it. We implemented robots.txt support to respect the source's wishes to limit what crawlers access; honoring it is a goodwill gesture on the crawler's part.

As a workaround, you can harvest the site (we tested this internally) by turning off the setting to respect robots.txt. You can change this on the harvest registration page.
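To check independently what a robots.txt file permits, a small sketch using Python's standard `urllib.robotparser` may help. It does not reproduce the suspected Geoportal bug. The robots.txt content, the `example.org` URLs and the user agent name `hypothetical-harvester` are all made up for illustration; substitute the real file from the site you are harvesting.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt, NOT the one served by the NOAA repository.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS_TXT, "hypothetical-harvester",
                 "http://example.org/oai?verb=Identify"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "hypothetical-harvester",
                 "http://example.org/private/records"))
```

If the real robots.txt allows the OAI endpoint here but the harvest still fails with the setting enabled, that would be consistent with a parsing problem on the harvester side.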

jacimize commented 7 years ago

Thanks, will test that ASAP.

simongis commented 5 years ago

Can someone confirm if Geoportal Server supports OAI-PMH harvesting?

mhogeweg commented 5 years ago

Yes, Geoportal Server supports OAI-PMH harvesting. This is the case for both Geoportal Server 1.x and 2.x.

simongis commented 5 years ago

Thanks @mhogeweg