NarrativeService: Bot Requests

HermannKroll commented 4 months ago

The Drug Overviews are currently being flooded with bot requests. The problem is that we write all of these requests into our log files.

Can we automatically detect bots? E.g., by checking whether the user agent name contains the word 'bot'. We could then write those requests to a separate log file (bot-log-file).
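A minimal sketch of the proposed check, assuming a plain case-insensitive substring match on the user agent. The names `BOT_MARKERS`, `is_bot`, `select_logger`, and the logger names `bot_requests` / `user_requests` are hypothetical and not taken from the NarrativeService code base.

```python
import logging

# Hypothetical marker list; the issue only proposes checking for 'bot'.
BOT_MARKERS = ("bot",)


def is_bot(user_agent: str) -> bool:
    """Return True if the user agent contains any marker (case-insensitive)."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in BOT_MARKERS)


def select_logger(user_agent: str) -> logging.Logger:
    """Pick the logger for a request based on its user agent.

    Each logger can be given its own FileHandler elsewhere, so bot
    traffic ends up in a separate file (the 'bot-log-file').
    """
    name = "bot_requests" if is_bot(user_agent) else "user_requests"
    return logging.getLogger(name)
```

A substring match like this will miss crawlers that do not identify themselves with 'bot' in the user agent, so it is probably best treated as a first filter rather than a complete solution.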

Requests in the nginx log look like this:

66.249.79.8 - - [26/Jun/2024:23:50:14 +0200] "GET /query_sub_count?query=Betaine+treats+Disease&data_source=PubMed HTTP/1.1" 200 4359 "https://narrative.pubpharm.de/drug_overview/?drug=Betaine" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.6422.154 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
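The line above appears to follow nginx's default "combined" log format. Assuming that is the case, a regular expression along these lines could extract the user agent for the bot check. `LOG_PATTERN` and `parse_line` are hypothetical names for illustration only.

```python
import re

# Assumes nginx's default "combined" format:
# ip - user [time] "request" status bytes "referer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line: str):
    """Return the fields of one access-log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None
```

Lines that do not match (for example because of a custom `log_format`) return `None` and can be skipped or reported separately.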

HermannKroll commented 3 months ago

Let's improve the logging script. We should start a process each night that counts everything and then writes the results to a cached JSON file.
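A rough sketch of what such a nightly job could do, assuming the requests are already available as (endpoint, user agent) pairs and that bots are identified by the 'bot' substring discussed earlier in this issue. The function names and the cache layout are hypothetical. How the job is scheduled (e.g. via cron) is left open here.

```python
import json
import os
import tempfile
from collections import Counter


def count_requests(entries):
    """Count requests per endpoint, split into 'user' and 'bot' traffic.

    entries: iterable of (endpoint, user_agent) pairs.
    """
    counts = {"user": Counter(), "bot": Counter()}
    for endpoint, user_agent in entries:
        kind = "bot" if "bot" in user_agent.lower() else "user"
        counts[kind][endpoint] += 1
    return {kind: dict(counter) for kind, counter in counts.items()}


def write_cache(counts, cache_path):
    """Write the counts as JSON via a temporary file and an atomic rename,
    so readers never see a half-written cache."""
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(counts, handle, indent=2, sort_keys=True)
    os.replace(tmp_path, cache_path)
```

The service would then only need to read the cached JSON file instead of recounting the logs on every request.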

HermannKroll commented 2 months ago

The logs are now cached each night.

HermannKroll commented 2 months ago

We are waiting for a response from our supplier.

HermannKroll / NarrativeIntelligence

NarrativeService: Bot Requests #289