eldy / AWStats

AWStats Log Analyzer project (official sources)
https://www.awstats.org

Hits vs Pages (2) #191

Open visualperception opened 3 years ago

visualperception commented 3 years ago

A couple of previous issues relating to this same problem are #59 and #137, neither of which has been addressed. Do we have any community members who are Perl-literate and would be willing to have a go at implementing this? Bllacky has described the issue in #137; I have described it in #59.

See below

Thanks,

visualperception (AKA RobC)

#137

Hi,

This is a feature request.

Overall I love Awstats for its simplicity. It works very well for most of my needs. However, it has a bit of trouble with bot detection.

I use several tools to monitor my website's traffic, all using various methods, and AWStats is one of them. Overall AWStats is in agreement with the others, with one exception: there are lots of bots out there on the internet that AWStats doesn't detect but which are obvious if you look at the List of Hosts.

Why are these bots obvious? Because most modern websites, including mine, load multiple files per visit: a bunch of CSS files, JS files, and so on. So if you see a visit that is 1 page, 1 hit and 10KB of traffic, you know that's not a real visit and probably a bot. If I eliminate these visits from those counted by AWStats, then the AWStats statistics are in agreement with those of Google Analytics or Matomo.

So my request is to let me set some parameters in the config file, based on which AWStats decides what is a real visit and what is a bot.

Example:

MinHitsPerVisit=7; (default=1) // Minimum number of hits required to consider a visit real (human) and not a bot.

MinHitsToPageRatio=1.5; (default=1) // Minimum ratio between hits and pages in a visit to consider a visit real (human) and not a bot.

MinVisitTraffic=100KB; (default=1KB) // Minimum traffic of a visit to consider that visit real (human) and not a bot.

Implement these three parameters and I can bring my AWStats into agreement with other traffic-measuring software.

Thank you very much!
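None of the three parameters proposed above exist in AWStats today. Purely as a sketch of the intent, and not as a statement about how awstats.pl actually structures its visit data, here is how the three thresholds could be applied to a per-visit summary in Perl; the field names, the visit record and the chosen values are assumptions for illustration only:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Proposed thresholds (hypothetical -- not existing AWStats directives).
my %conf = (
    MinHitsPerVisit    => 7,           # minimum hits for a human visit
    MinHitsToPageRatio => 1.5,         # minimum hits/pages ratio
    MinVisitTraffic    => 100 * 1024,  # minimum bytes transferred per visit
);

# Classify one visit summary as real (human) or bot using the thresholds.
sub is_real_visit {
    my ($visit) = @_;
    return 0 if $visit->{hits} < $conf{MinHitsPerVisit};
    return 0 if $visit->{pages} > 0
             && $visit->{hits} / $visit->{pages} < $conf{MinHitsToPageRatio};
    return 0 if $visit->{bytes} < $conf{MinVisitTraffic};
    return 1;
}

# Example: the "1 page, 1 hit, 10KB" visit described in this issue
# would be reclassified as a bot under any of the three thresholds.
my $visit = { pages => 1, hits => 1, bytes => 10 * 1024 };
print is_real_visit($visit) ? "real visit\n" : "probable bot\n";
```

Whether such a check belongs in the visit-counting logic of awstats.pl or in a post-processing step over the data files is a design decision a Perl-literate contributor would have to make.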

#59

I have noticed an increasing number of bad robots that don't identify themselves as robots. Typically they will fetch a site's root/home page HTML file and nothing else. This can be seen in the Hosts (IP) report, where you can see 1 page and 1 hit, or 2 pages and 2 hits, etc. against a host/IP. Checking the raw log files confirms this, and that the user agent has no bot identification in it. Unfortunately these bad bots are added to the unique visitors count when they should in fact be added to the unidentified robots count.

I appreciate this is tricky to catch in AWStats, especially since it could be a returning visitor with most of the hit files already in the browser cache. However, it's pretty obvious these visits are not real visitors, and the volume of them and their regular visits is very large: something like 30% of visitors on one site I look after, and similar on a couple of others. This completely distorts the stats, giving the impression of far more real visitors than there actually are.

Is there any way to modify AWStats to incorporate a configurable conf file option saying a page must have x number of hits on it to be considered a real visitor, otherwise it's an unidentified robot? Most pages these days will have at least half a dozen or more file hits on them, so the data is already in the AWStats program. How easy that is to implement may be another matter; I don't know.
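Not an AWStats feature, but until a conf option like the one asked for above exists, one possible workaround is to pre-filter the access log before awstats.pl reads it: count the hits per client IP and drop IPs that only ever made one or two requests. Below is a minimal sketch, assuming a common/combined-format log where the client address is the first field; the script name and the threshold are made up for this example:

```perl
#!/usr/bin/perl
# prefilter_log.pl -- illustrative pre-filter, not part of AWStats.
# Reads an access log on STDIN, counts hits per client IP, and prints
# only the lines belonging to IPs with at least $MIN_HITS hits.
use strict;
use warnings;

my $MIN_HITS = 3;    # assumed threshold; tune per site

my (@lines, %hits_per_ip);
while (my $line = <STDIN>) {
    my ($ip) = $line =~ /^(\S+)/;    # client address is the first field
    next unless defined $ip;
    $hits_per_ip{$ip}++;
    push @lines, [ $ip, $line ];
}

# Second pass over the buffered lines. Holding the whole log in memory
# is acceptable for a sketch, but a real tool would avoid it.
for my $entry (@lines) {
    my ($ip, $line) = @$entry;
    print $line if $hits_per_ip{$ip} >= $MIN_HITS;
}
```

The filtered output could then be fed to AWStats instead of the raw log (for example by pointing the LogFile directive at the filtered file). Obviously this throws the single-hit lines away entirely rather than reclassifying them as unidentified robots, so it is a workaround, not the fix being requested here.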

Bllacky commented 3 years ago

I second this.

visualperception commented 3 years ago

Anyone else willing to look at fixing this inaccuracy in "Unique Visitors"?