Open luminoso opened 7 years ago
I appear to be having the same issue on 7.4. In /var/lib/awstats/awstats032018.cryptfolio.com.txt, the errors are counted:
# Errors - Hits - Bandwidth
BEGIN_ERRORS 11
301 762 2950772
404 716 3025993 # <------------ 716 counts of 404 errors
206 12 347942
302 228 320151
400 175 188388
503 48 162690
500 17 20253
422 53 151111
403 2 8140
502 1 914
401 4 17769
END_ERRORS
But later on:
BEGIN_SIDER_404 6 # <----------- but it's showing only six rows
/assets/bundle-0f112d97ccfac5359e895ff4ce99ec70ef000f67.js 1 https://preview.cryptfolio.com/currencies/grid
/assets/application-5a4ee7f00194b83b1dfaa3eb3b5410fc8adb98a2c3aeac5a82bf8007073eaa07.js 1 https://preview.cryptfolio.com/currencies/grid
/assets/application-0f112d97ccfac5359e895ff4ce99ec70ef000f67.css 1 https://preview.cryptfolio.com/currencies/grid
/wp-login.php 3 -
/assets/secure-0f112d97ccfac5359e895ff4ce99ec70ef000f67.css 1 https://preview.cryptfolio.com/currencies/grid
/sidekiq/images/status.png 2 https://cryptfolio.com/sidekiq/stylesheets/application.css
END_SIDER_404
This is reflected in the generated HTML.
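One way to check for this mismatch without eyeballing the file is to compare the hit count in BEGIN_ERRORS with the sum of hits in the matching BEGIN_SIDER_<code> section. The following is only a sketch in Python, assuming the whitespace-separated layout shown in the excerpts above; the function names are made up for illustration and SAMPLE is a trimmed excerpt of the data quoted above, not a complete data file.

```python
import re
from typing import Dict, List

# Trimmed excerpt of the data file quoted above (section sizes adjusted to match).
SAMPLE = """\
# Errors - Hits - Bandwidth
BEGIN_ERRORS 3
301 762 2950772
404 716 3025993
400 175 188388
END_ERRORS
BEGIN_SIDER_404 2
/wp-login.php 3 -
/sidekiq/images/status.png 2 https://cryptfolio.com/sidekiq/stylesheets/application.css
END_SIDER_404
"""


def read_sections(text: str) -> Dict[str, List[List[str]]]:
    """Split data-file text into {section name: list of whitespace-split rows}."""
    sections: Dict[str, List[List[str]]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        begin = re.match(r"BEGIN_([A-Z0-9_]+)", line)
        if begin:
            current = begin.group(1)
            sections[current] = []
        elif line.startswith("END_"):
            current = None
        elif current is not None:
            sections[current].append(line.split())
    return sections


def error_hits(sections: Dict[str, List[List[str]]], code: int) -> int:
    """Hits recorded for `code` in the ERRORS summary section."""
    for row in sections.get("ERRORS", []):
        if row[0] == str(code):
            return int(row[1])
    return 0


def detail_hits(sections: Dict[str, List[List[str]]], code: int) -> int:
    """Total hits across all URLs listed in the SIDER_<code> detail section."""
    return sum(int(row[1]) for row in sections.get(f"SIDER_{code}", []))


if __name__ == "__main__":
    parsed = read_sections(SAMPLE)
    print("summary:", error_hits(parsed, 404), "detail:", detail_hits(parsed, 404))
```

If the two numbers drift further apart after each -update run, the detail section is not keeping up with the summary.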
I've checked that the Apache logs are in the correct format. Here are two requests from access.log, the second one a 404 for a missing page:
101.98.118.234 - - [02/Mar/2018:01:46:49 +0000] "GET / HTTP/1.1" 200 7815 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0"
101.98.118.234 - - [02/Mar/2018:01:47:33 +0000] "GET /missing HTTP/1.1" 404 2495 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0"
So they're the same format. I can't think of why this could be occurring except for a bug in the way new 404 records are stored.
I tried requesting the missing /sidekiq/images/status.png above and then regenerating the awstats data, but the new hit wasn't picked up. So it looks like awstats is not picking up, or not saving, new 404 log entries, even for a URL that already has a 404 entry.
The commands I'm using to generate awstats:
/usr/lib/cgi-bin/awstats.pl -config=cryptfolio.com -update
/usr/share/awstats/tools/awstats_buildstaticpages.pl -config=cryptfolio.com -dir=/var/www/preview.cryptfolio.com/awstats/ -awstatsprog=/usr/lib/cgi-bin/awstats.pl -showcorrupted
Happy to share some more log files if necessary. On Ubuntu 16.04 LTS.
The cause is likely something specific to your site and the traffic you're getting, such as a rare kind of request. I don't see this on my site (running awstats 7.7).
If I had to guess, some request unrelated to the 404s corrupts the state. I've seen this before (e.g. in #15 and #63, awstats corrupts its internal state when it sees 206 requests, although not permanently like in this case).
The BEGIN_ERRORS and BEGIN_SIDER_404 sections above look OK to me.
By the way, the relevant code that counts 404s starts below this line:
I note that you also have HTTP 400 errors, and the detail for those is wrong too. I wonder if there is a problem with having two 4xx error codes that both produce linked detail pages.
My site is fine but I haven't seen any http 400 errors this year or last year.
You probably don't see 400 errors (bad request) because awstats drops them as corrupted. For instance, my Apache logs a request that results in a 400 error with a line like this:
0.0.0.0 - - [02/Mar/2018:17:10:14 +0100] "" 400 284 404 - "-" "-" default 443
I think awstats fails to parse that because the 5th field doesn't contain an HTTP method, path and version.
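To illustrate the point, here is a rough Python sketch of a strict log-line pattern that expects a method, path and version inside the quoted request field. This is not the parser awstats uses; the regular expression and the function name are assumptions made for the example. A normal 404 line matches, while the 400 line with an empty request does not.

```python
import re
from typing import Dict, Optional

# Hypothetical strict pattern: the quoted request must be "<METHOD> <path> HTTP/<x.y>".
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) (?P<version>HTTP/[\d.]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Return the named fields of a log line, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


# Both lines are taken from this thread (user agent shortened in the first one).
GOOD = ('101.98.118.234 - - [02/Mar/2018:01:47:33 +0000] '
        '"GET /missing HTTP/1.1" 404 2495 "-" "Mozilla/5.0"')
BAD = '0.0.0.0 - - [02/Mar/2018:17:10:14 +0100] "" 400 284 404 - "-" "-" default 443'

if __name__ == "__main__":
    for line in (GOOD, BAD):
        fields = parse_line(line)
        print("parsed" if fields else "dropped", fields)
```

A 400 logged with a well-formed request, such as "GET / HTTP/1.1", would still match this pattern; only the empty request field is rejected.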
Hello,
My system is encountering the same problem. It's a simple setup, configured on April 2, 2018, with the latest AWStats (7.7). When I set it up, I had just created new log files (to split two websites that until then had been logging to the same file). So the database starts only with hits that were made that day.
On my busier site, there were no 404 hits between the time I created the new log and created the first AWStats database. I noticed this morning (April 13) I have 76 hits on 404 in the counter, but on the detail page there's nothing. I can see these 404 hits in the logs. Notice it seems to apply to the 400 hits too: no detail being recorded.
# Errors - Hits - Bandwidth
BEGIN_ERRORS 7
302 21 8
206 14 112099
303 41 0
400 10 2170
404 76 16539
410 17 115510
301 187 52203
END_ERRORS
# URL with 403 errors - Hits - Last URL referrer
BEGIN_SIDER_403 0
END_SIDER_403
# URL with 400 errors - Hits - Last URL referrer
BEGIN_SIDER_400 0
END_SIDER_400
# URL with 404 errors - Hits - Last URL referrer
BEGIN_SIDER_404 0
END_SIDER_404
In the log file, there are entries such as this:
46.119.112.226 - - [11/Apr/2018:11:53:26 -0700] "POST /resource-center/glossary/general-terms/trackback/ HTTP/1.1" 404 247 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3251.21 Safari/537.36"
46.161.55.106 - - [13/Apr/2018:01:00:00 -0700] "GET /_asterisk/ HTTP/1.1" 404 208 "-" "python-requests/2.18.4"
Here is an example of a 400 entry:
168.1.128.34 - - [13/Apr/2018:00:39:21 -0700] "GET / HTTP/1.1" 400 226 "-" "-"
On my second site, there are six entries in the details, and I confirmed that these all came in on April 2, and were present when I did the first database build. Everything else after didn't make it.
I have some 206 and 400 entries in one site, and 400 in the other.
Jeffrey Fox
The problem is this section of code (starting at line 5938):
if ($withpurge) {
%_sider_h = ();
%_referer_h = ();
%_err_host_h = ();
}
That will purge new data for all error codes, not only for the current one (e.g. $_sider_h{$code}).
This indeed seems to be correct. It explains why only some people are encountering this bug. It only manifests itself if you have TrapInfosForHTTPErrorCodes set in the config file, and you list more than one error code there.
Due to the bug, data that eventually renders into the 404 detail page only gets updated for one HTTP error code (except on the first parse of the log file in a month, when a new data file gets created). By default, awstats only saves details for 404, so everything works correctly. But if you have TrapInfosForHTTPErrorCodes, then it might happen that some other error code gets saved first, and 404 errors are lost.
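The effect can be reproduced with a toy model. The sketch below is Python, not the Perl from awstats.pl, and it only mirrors the behaviour as described in this thread: detail sections are written one error code at a time, and memory is purged after each write. All names are hypothetical.

```python
import copy
from typing import Dict, List

Details = Dict[str, Dict[str, int]]  # {error code: {url: hits}}


def save_sections(pending: Details, codes: List[str], purge_all: bool) -> Details:
    """Write one detail section per code, purging in-memory data after each write.

    purge_all=True models `%_sider_h = ()` (wipes every code's data).
    purge_all=False models purging only the code that was just written.
    """
    pending = copy.deepcopy(pending)  # leave the caller's data untouched
    written: Details = {}
    for code in codes:
        written[code] = dict(pending.get(code, {}))
        if purge_all:
            pending.clear()          # data for codes not yet written is lost here
        else:
            pending.pop(code, None)  # only the section just written is dropped
    return written


if __name__ == "__main__":
    new_hits = {"400": {"/": 10}, "404": {"/_asterisk/": 1}}
    print("purge all:    ", save_sections(new_hits, ["400", "404"], purge_all=True))
    print("purge current:", save_sections(new_hits, ["400", "404"], purge_all=False))
```

With purge_all=True, only the code written first keeps its new entries, which matches the symptom that the 404 detail stops updating when another trapped code comes first. Restricting the purge to the current code is one possible direction for a fix, though the actual change would have to be made and verified in the Perl source.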
I'm currently running awstats 7.6 (7.6-3.1.el7) on a CentOS 7.3.1611 machine that parses nginx (1:1.10.2-1.el7) logs.
The problem I'm having is that after the initial parse of the logs, the hit counter on the 404 hits detail page (example) stops updating.
The AWStats summary front page continues counting correctly. It's just the details that stop updating.
A quick workaround is to delete the contents of the /var/lib/awstats directory and let awstats re-parse everything. However, it only works the first time.
I suspected that it may be related to SELinux, but setting it to permissive doesn't help. No errors are shown when updating manually.
Any ideas what it could be, or how I can debug this further?