ProjectSidewalk / SidewalkWebpage

Project Sidewalk web page
http://projectsidewalk.org
MIT License

Consider filtering out logs from Uptime Robot #3742

Open misaugstad opened 3 days ago

misaugstad commented 3 days ago
Brief description of problem/feature

We use a service called Uptime Robot to help notify us if a server has gone down. It pings every server once every 5 minutes by trying to access the /signIn page. The downside is that we now have hundreds of thousands of entries in our webpage_activity table, one for every one of those pings, of the form Visit_SignIn. We have the space for it, but it makes things a bit messy when I want to look at the logs to help with an authentication issue for someone, for example.

At one point I set it up to manually filter out logs from a specific IP address where most of the data came from (in the save() function in WebpageActivityTable.scala). But that address never covered all of the IPs that were used, and Uptime Robot made some infrastructure changes in October that replaced the list of IPs they use.

Potential solution(s)

We could consider storing those IPs in a config file, or even just in the WebpageActivity.scala file, to filter them out. Below is the page with the current list of IPs that are used: https://uptimerobot.com/help/locations/?utm_source=newsletter&utm_medium=email&utm_campaign=core-free
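A minimal sketch of what the filter could look like if the IPs lived in a Scala file. Everything here is an assumption for illustration: the object name `MonitorLogFilter`, the `shouldLog` helper, and the two placeholder addresses (taken from the reserved documentation ranges, not from Uptime Robot's real list). It also assumes we only want to drop Visit_SignIn entries from those IPs, rather than everything they send.

```scala
// Hypothetical sketch: these names are illustrative and do not come from the
// SidewalkWebpage code base.
object MonitorLogFilter {
  // Assumption: placeholder addresses from the documentation ranges,
  // standing in for the real list published by Uptime Robot.
  val monitorIps: Set[String] = Set("192.0.2.10", "198.51.100.7")

  // The activity string that the pings to /signIn produce.
  val signInActivity: String = "Visit_SignIn"

  /** Returns true when the entry should still be written to webpage_activity. */
  def shouldLog(ipAddress: String, activity: String): Boolean =
    !(activity == signInActivity && monitorIps.contains(ipAddress))
}
```

The save() function could then skip the insert whenever `MonitorLogFilter.shouldLog(ip, activity)` returns false. Keeping the addresses in a `Set` means a future change to the list on Uptime Robot's side only requires editing one value, and the same check would work if the set were loaded from a config file instead.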

Then we could delete the large pile of Visit_SignIn entries from those IPs from the databases, and hopefully find the old list of IPs so that we can remove those entries as well.
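For the cleanup, here is a rough sketch of building a parameterized DELETE for a given list of IPs. The column names `activity` and `ip_address` are assumptions about the webpage_activity schema, and `MonitorLogCleanup` and `deleteStatement` are hypothetical names. The sketch only builds the statement and its bind values; it does not run anything against a database.

```scala
// Hypothetical sketch: column names are assumed, not checked against the schema.
object MonitorLogCleanup {
  /** Builds a parameterized DELETE and its bind values for the given IPs. */
  def deleteStatement(ips: Seq[String]): Option[(String, Seq[String])] =
    if (ips.isEmpty) None // an empty IN () list is invalid SQL, so build nothing
    else {
      // One "?" placeholder per IP, so the values are bound rather than inlined.
      val placeholders = ips.map(_ => "?").mkString(", ")
      val sql =
        "DELETE FROM webpage_activity " +
        "WHERE activity = ? AND ip_address IN (" + placeholders + ")"
      // The first bind value is the activity, followed by the IPs in order.
      Some((sql, "Visit_SignIn" +: ips))
    }
}
```

Running a SELECT COUNT(*) with the same WHERE clause first would be a cheap way to confirm that only the expected entries match before deleting anything. The same function could be reused for the old list of IPs if we manage to find it.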