getsentry / sentry

Developer-first error tracking and performance monitoring
https://sentry.io

Facebook crawler ignored #20656

Open 3zzy opened 4 years ago

3zzy commented 4 years ago

Although we have "Filter out known web crawlers" enabled, the Facebook crawler still gets through.

https://github.com/vuejs/vue/issues/10049#issuecomment-527724950

More info: https://github.com/aFarkas/lazysizes/issues/520#issuecomment-505269733 https://www.glowmetrics.com/blog/what-is-fbclid-how-to-remove-fbclid-parameter-from-google-analytics/#gref https://medium.com/@wrel/what-is-the-fbclid-parameter-7f54d890eaea

3zzy commented 4 years ago

Currently we are filtering FB crawler visits by IP.

To get a current list of IP addresses the crawler uses, run the following command:

whois -h whois.radb.net -- '-i origin AS32934' | grep ^route

Source: https://developers.facebook.com/docs/sharing/webmasters/crawler/

Then just add those IPs and CIDR ranges to your project's inbound filters. I wish Sentry did this for us with a simple toggle - "Ignore Facebook Crawler - On/Off" - instead of us having to manually update those IPs occasionally.
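
For anyone who wants to script the manual step, here is a minimal sketch of turning the `grep ^route` output into a list of CIDR blocks and checking an address against it. It assumes the whois output has lines of the form `route: <prefix>` and `route6: <prefix>`. The helper names `parse_routes` and `is_listed` are made up for illustration, and the sample text uses reserved documentation ranges, not real Facebook addresses. It does not talk to Sentry at all; pasting the resulting ranges into the inbound filter is still a manual step.

```python
import ipaddress

# Hypothetical sample of `whois ... | grep ^route` output.
# These are reserved documentation prefixes, NOT real crawler ranges.
SAMPLE = """route:      192.0.2.0/24
route6:     2001:db8::/32
"""


def parse_routes(whois_output):
    """Collect CIDR blocks from lines starting with 'route' (route: / route6:)."""
    networks = []
    for line in whois_output.splitlines():
        if not line.startswith("route"):
            continue
        # Split on the first colon only, so IPv6 prefixes stay intact.
        _, _, value = line.partition(":")
        value = value.strip()
        if value:
            networks.append(ipaddress.ip_network(value, strict=False))
    return networks


def is_listed(ip, networks):
    """Return True if the address falls inside any of the parsed blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr.version == net.version and addr in net for net in networks)


networks = parse_routes(SAMPLE)
for net in networks:
    # One CIDR block per line, ready to copy into a filter list.
    print(net)
```

In practice you would feed the real command output into `parse_routes` (for example by piping it to the script and reading `sys.stdin`) and re-run it whenever the list needs refreshing.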

github-actions[bot] commented 3 years ago

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Accepted, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

riki137 commented 3 years ago

This is still a problem. We're getting several misleading errors because of it.

BYK commented 3 years ago

@riki137 thanks for the thumbs up. This is not on our priority list, so we cannot promise a timeline for when this will be resolved.

If you'd like to give it a try, we can assist you with a PR though.

shellmayr commented 11 hours ago

Creating an ingest bug out of this. There is an existing rule for Facebook crawlers (from 2019), but there seem to have been reports of it not working after it was created. This should be looked into, and closed if the buggy behaviour no longer persists.