Do not call third party services if known crawler

gbif / portal16

GBIF.org website

https://www.gbif.org

Apache License 2.0

Do not call third party services if known crawler #1944

Closed MortenHofft closed 4 months ago

MortenHofft commented 4 months ago

We integrate with other services to get, for example, CITES status. When we are crawled by services that actually render the client-side code, many of these calls are triggered, and at times they bring down our partner services.

We should try to prevent that. One approach could be to detect spiders that identify themselves, and then simply not contact external services in those cases.
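A minimal sketch of that idea, assuming detection is done by matching the `User-Agent` string against a few common crawler tokens. The names `CRAWLER_PATTERN`, `isKnownCrawler` and `callUnlessCrawler` and the token list are hypothetical illustrations, not the code used in portal16:

```typescript
// Hypothetical token list for self-identifying crawlers (illustrative only).
const CRAWLER_PATTERN = /bot|crawler|spider|slurp/i;

/** Returns true when the user agent looks like a self-identifying crawler. */
function isKnownCrawler(userAgent: string | undefined): boolean {
  if (!userAgent) {
    return false; // No user agent available: treat as a regular visitor.
  }
  return CRAWLER_PATTERN.test(userAgent);
}

/**
 * Runs `call` only for non-crawler visitors. Crawlers get `fallback` instead,
 * so no request is sent to the external service.
 */
function callUnlessCrawler<T>(
  userAgent: string | undefined,
  call: () => T,
  fallback: T
): T {
  return isKnownCrawler(userAgent) ? fallback : call();
}

// Usage: a stand-in for an external lookup that counts how often it is invoked.
let externalCalls = 0;
const fetchCitesStatus = (): string => {
  externalCalls += 1;
  return "status from external service";
};

const googlebotUA =
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";
const browserUA =
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36";

const crawlerResult = callUnlessCrawler(googlebotUA, fetchCitesStatus, "skipped");
const visitorResult = callUnlessCrawler(browserUA, fetchCitesStatus, "skipped");
```

With this shape the external lookup runs once (for the browser visitor) and is skipped for the crawler. It only helps against crawlers that announce themselves; a spider sending a regular browser user agent would still trigger the calls.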

MortenHofft commented 4 months ago

I've added the above logic for CITES, IUCN, openTreeOfLife and Wikidata. How well it works will require real-world testing.