ukmda / ukmda-dataprocessing

UK Meteor data analysis code and libraries
https://archive.ukmeteors.co.uk
GNU General Public License v3.0

Check website logs for bot activity #255

Open markmac99 opened 1 year ago

markmac99 commented 1 year ago

Some meteor reporting sites are reporting bot activity. Check the logs to see if there's any evidence.

As for what to do about it, that's less clear.

markmac99 commented 1 year ago

I've enabled CloudFront logging. Will check in a week.

markmac99 commented 1 year ago

Create an Amazon Athena table for the data, as explained here: https://docs.aws.amazon.com/athena/latest/ug/cloudfront-logs.html

Then run something like:

SELECT DISTINCT request_ip, referrer, user_agent FROM cloudfront_logs WHERE "date" BETWEEN DATE '2023-08-19' AND DATE '2023-08-20' AND referrer NOT LIKE 'https://archive.ukmeteornetwork.co.%';
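To re-run that check for other weeks without hand-editing the dates, the query text could be generated. This is only a sketch: the function name `build_bot_query` and its parameters are made up for illustration, the table and column names are taken from the query above, and the resulting string would still need to be pasted into or submitted to Athena separately.

```python
from datetime import date


def build_bot_query(start: date, end: date, own_referrer_prefix: str,
                    table: str = "cloudfront_logs") -> str:
    """Return Athena SQL listing distinct visitors not referred by our own site.

    Hypothetical helper: 'table' is interpolated as-is, so only pass
    trusted values for it.
    """
    if end < start:
        raise ValueError("end date must not be before start date")
    # Double any single quotes so the prefix is a valid SQL string literal.
    prefix = own_referrer_prefix.replace("'", "''")
    return (
        "SELECT DISTINCT request_ip, referrer, user_agent\n"
        f"FROM {table}\n"
        f"WHERE \"date\" BETWEEN DATE '{start.isoformat()}' "
        f"AND DATE '{end.isoformat()}'\n"
        f"  AND referrer NOT LIKE '{prefix}%'"
    )


if __name__ == "__main__":
    print(build_bot_query(date(2023, 8, 19), date(2023, 8, 20),
                          "https://archive.ukmeteornetwork.co."))
```

Note that `%` and `_` inside the prefix would still act as LIKE wildcards; that is harmless for the prefix used here.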

markmac99 commented 1 year ago

Added a robots.txt file, though I doubt ByteDance will honour it.
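For reference, the rules in a robots.txt can be sanity-checked offline with Python's standard `urllib.robotparser`. The rules below are not the file that was actually deployed, just a hypothetical example, and `Bytespider` is assumed to be the token ByteDance's crawler identifies itself with. This only shows what a well-behaved crawler would do; it cannot tell us whether a given bot actually obeys the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration, not the deployed robots.txt.
ROBOTS_TXT = """\
User-agent: Bytespider
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    site = "https://archive.ukmeteors.co.uk/"
    for agent in ("Bytespider", "Googlebot"):
        print(agent, "allowed" if allowed(agent, site) else "blocked")
```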

markmac99 commented 1 year ago

Lots of scanning by bots: googlebot, bingbot.com, bytedance.com (TikTok), semrush.com, yandex.com, petalsearch.com, coccoc.com.
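A rough way to tally which crawlers show up in the `user_agent` values exported from the query results is sketched below. The substrings in `BOT_TOKENS` are assumptions about how these crawlers usually identify themselves, not something taken from our logs, and the function names are made up. User-agent strings are easy to spoof, so treat the counts as indicative only.

```python
from collections import Counter
from typing import Iterable, Optional

# Assumed user-agent substrings (lowercase) mapped to the operator.
BOT_TOKENS = {
    "googlebot": "Google",
    "bingbot": "Bing",
    "bytespider": "ByteDance",
    "semrushbot": "Semrush",
    "yandex": "Yandex",
    "petalbot": "Petal Search",
    "coccocbot": "Coc Coc",
}


def classify(user_agent: str) -> Optional[str]:
    """Return the operator name if the user agent matches a known token."""
    ua = user_agent.lower()
    for token, operator in BOT_TOKENS.items():
        if token in ua:
            return operator
    return None


def tally(user_agents: Iterable[str]) -> Counter:
    """Count requests per operator, ignoring unrecognised user agents."""
    counts: Counter = Counter()
    for ua in user_agents:
        operator = classify(ua)
        if operator is not None:
            counts[operator] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html)",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:117.0) Gecko/20100101 Firefox/117.0",
    ]
    print(tally(sample))
```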