valentinedwv closed 5 years ago
Pull request #101 attempts to address this issue. It reads the "Retry-After" response header and applies the given delay. That, of course, will cause the harvester to appear slow.
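To illustrate the idea (this is not the code from the pull request), here is a minimal Python sketch of turning a "Retry-After" value into a delay. The header can carry either a number of seconds or an HTTP-date. The function name `retry_after_seconds` and the 5-second `default` fallback are assumptions made for this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None, default=5.0):
    """Convert a Retry-After header value into a delay in seconds.

    `default` is a hypothetical fallback used when the header is
    missing or cannot be parsed.
    """
    if header_value is None:
        return default
    value = header_value.strip()
    # Form 1: delay-seconds, e.g. "120"
    if value.isdigit():
        return float(value)
    # Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no waiting is needed.
    return max(0.0, (when - now).total_seconds())
```

The harvester would sleep for the returned number of seconds before repeating the request, which is where the apparent slowness comes from.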
I think the 429 status code was introduced to let servers protect themselves against DDoS-type attacks (or unwanted web crawlers). Hence, a desirable solution would be a whitelist mechanism, where the harvester's IP is whitelisted on the server side as part of some agreement or partnership and is allowed unlimited access to the resources, while everyone else would only get a sneak peek of the content.
Pondering how to handle this in the codebase: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429 A watched session will not trigger the rate limit.
Returning 429 is not quite the implementation the OAI-PMH guidelines describe (they call for 503 with a Retry-After header), so 503 should also be trapped: http://www.openarchives.org/OAI/2.0/guidelines-repository.htm#FlowControlAndLoadBalancing
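A rough Python sketch of trapping both status codes in one retry loop might look like the following. Everything here is illustrative: `fetch` is a hypothetical callable returning `(status, headers, body)`, `headers` is assumed to be a plain dict keyed by "Retry-After", and only the seconds form of the header is handled (anything else falls back to `default_delay`).

```python
import time

# Status codes treated as "slow down and try again".
RETRYABLE_STATUSES = {429, 503}


def fetch_with_backoff(fetch, max_attempts=5, default_delay=5.0, sleep=time.sleep):
    """Call `fetch()` until the response is no longer throttled.

    `fetch` is a hypothetical callable returning (status, headers, body).
    `sleep` is injectable so the loop can be tested without waiting.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = fetch()
        if status not in RETRYABLE_STATUSES:
            return status, body
        if attempt == max_attempts:
            break
        # Only the delay-seconds form is parsed in this sketch.
        raw = headers.get("Retry-After", "")
        delay = float(raw) if raw.strip().isdigit() else default_delay
        sleep(delay)
    raise RuntimeError(
        f"still throttled after {max_attempts} attempts (last status {status})"
    )
```

Capping the number of attempts keeps a harvest from hanging indefinitely when the server never lifts the limit.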
endpoint: https://ws.pangaea.de/oai/provider
At about 60 records, we get a Too Many Requests response (HTTP 429).