amplitude / Amplitude-JavaScript

JavaScript SDK for Amplitude
MIT License

Petalbot should be ignored user agent #571

Closed: jacobsimon closed this issue 1 year ago

jacobsimon commented 1 year ago

Expected Behavior

The Amplitude client should filter out known web crawlers, identified by user agent, so that they do not send events.

Current Behavior

I'm seeing a large volume of requests, apparently from PetalBot, in our events dashboards. Confusingly, the user agent comes through as Mobile Safari on an Android device, but a reverse IP lookup traced the requests back to a Singaporean 114.119.xxx IP range belonging to petalsearch.com. We did not see a corresponding increase in events in Google Analytics, which we also use.

[Screenshot: Screen Shot 2023-01-05 at 4 53 09 PM]

Possible Solution

The client library (or Amplitude's servers) should drop events from these bots, by checking either the user agent or the IP address.
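As a stopgap on the client side, a user agent check along these lines might work. This is only a sketch: `BOT_PATTERNS` and `isLikelyBot` are hypothetical names, not part of the SDK, and the pattern list is an assumption that would need tuning against real traffic. It only helps when the bot's name actually appears in the raw user agent string, which may not be what the dashboard shows after parsing.

```javascript
// Hypothetical pattern list (an assumption, not taken from the SDK);
// extend it with the crawlers that show up in your own data.
const BOT_PATTERNS = [/petalbot/i, /googlebot/i, /bingbot/i, /crawler/i, /spider/i];

// Returns true when the user agent string matches any bot pattern.
// Missing or empty user agents are treated as "not a bot".
function isLikelyBot(userAgent, patterns = BOT_PATTERNS) {
  if (typeof userAgent !== 'string' || userAgent.length === 0) {
    return false;
  }
  return patterns.some((pattern) => pattern.test(userAgent));
}

// In the browser, the check could opt the client out before any event is logged:
// if (isLikelyBot(navigator.userAgent)) {
//   amplitude.getInstance().setOptOut(true);
// }
```

Opting out with `setOptOut(true)` keeps the rest of the tracking code unchanged, since later `logEvent` calls are simply not sent.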

Related issue for Mixpanel found here: https://github.com/mixpanel/mixpanel-js/issues/316

Steps to Reproduce

Unknown - N/A

Environment

liuyang1520 commented 1 year ago

Hi @jacobsimon,

Thanks for supporting Amplitude! I brought this to the team to see how we can help with the bot traffic issue. I will share any updates here.

liuyang1520 commented 1 year ago

Hi @jacobsimon,

After sharing this with the team, I don't think we have plans to block bot traffic on the client side at the moment. However, our backend ingestion host can block events by IP list, user agent list, and even device ID, user ID, etc. This configuration is not currently exposed publicly; if you want to set it up, please contact customer support.
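For anyone who routes events through their own proxy before they reach Amplitude, the same kind of blocklist could be approximated there. The sketch below is not Amplitude's backend implementation or configuration format. The `blocklist` shape, the event field names, and the `isBlocked` and `filterEvents` helpers are all hypothetical, and the `114.119.` prefix is only taken from the range reported in this issue.

```javascript
// Hypothetical blocklist shape; the field names are illustrative only
// and do not reflect Amplitude's actual configuration.
const blocklist = {
  ipPrefixes: ['114.119.'],          // prefix reported in this issue; the real range may be narrower
  userAgentPatterns: [/petalbot/i],
  deviceIds: new Set(),
  userIds: new Set(),
};

// Returns true when an event matches any entry of the blocklist.
function isBlocked(event, rules) {
  const ip = event.ip || '';
  const userAgent = event.userAgent || '';
  return (
    rules.ipPrefixes.some((prefix) => ip.startsWith(prefix)) ||
    rules.userAgentPatterns.some((pattern) => pattern.test(userAgent)) ||
    rules.deviceIds.has(event.deviceId) ||
    rules.userIds.has(event.userId)
  );
}

// Keeps only the events that no rule matches.
function filterEvents(events, rules = blocklist) {
  return events.filter((event) => !isBlocked(event, rules));
}
```

Matching on the IP prefix matters here because the traffic described above reported an ordinary mobile browser user agent, so a user agent list alone might miss it.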

Thanks for supporting Amplitude!