darold / squidanalyzer

Squid Analyzer parses Squid proxy access log and reports general statistics about hits, bytes, users, networks, top URLs, and top second level domains. Statistic reports are oriented toward user and bandwidth control.
http://squidanalyzer.darold.net/

SquidAnalyzer fails to update statistics after cleanup of access.log #105

Closed mkhallaf closed 8 years ago

mkhallaf commented 9 years ago

SquidAnalyzer fails to update statistics after cleanup of access.log. Any workaround? I can't keep storing a 6+ GB access.log, and I will never use a database of any sort. I like access.log and I will keep using it.

mkhallaf commented 9 years ago

I figured it out: you must go inside the year folder, remove all .dat files, then use the -r option to rebuild. This must be mentioned in the documentation. It was a great nuisance and a disappointment to have stats stop updating after rotating access.log.

Otherwise, the -r option should do that itself.

darold commented 9 years ago

I do not understand what issue you are facing. Can you explain how you are using SquidAnalyzer? What is your crontab entry for SquidAnalyzer, and what do you mean by "after cleanup of access.log"? You don't have to remove .dat files or use the --rebuild option in normal use, so I think you are missing something in how you use SquidAnalyzer. Please explain exactly what the problem is.

Best regards,

mkhallaf commented 9 years ago

Thank you for your response. I just empty access.log, bringing it down to zero bytes. No matter how I do it, whether with 'echo > access.log' or by deleting and recreating an empty file, and whether or not squid was stopped while I did it (I know you might suspect a broken start of file), I always end up with SquidAnalyzer failing to update stats. It always outputs zeros in debug messages and adds nothing to stats.

Hope that helps.

darold commented 9 years ago

Could you post here the first line of the access.log file and the content of the SquidAnalyzer.current history file?

mkhallaf commented 9 years ago

I couldn't attach access.log in plain text for some reason, so I attached a snapshot from Notepad++ with all symbols shown:

[Screenshot: access log]

For SquidAnalyzer.current:

1444608479.265 23521971
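For readers following along, the two fields above appear to be an epoch timestamp and a byte offset into access.log (the debug output later in the thread reports the same offset as HISTORY OFFSET). A minimal Python sketch for decoding such a line, assuming that two-field layout; `parse_history` is a hypothetical helper, not part of SquidAnalyzer:

```python
from datetime import datetime, timezone


def parse_history(line):
    """Split a 'timestamp offset' line into (UTC datetime, byte offset).

    Assumes the two-field layout shown in this thread; hypothetical helper.
    """
    ts_text, offset_text = line.split()
    when = datetime.fromtimestamp(float(ts_text), tz=timezone.utc)
    return when, int(offset_text)


when, offset = parse_history("1444608479.265 23521971")
print(when.strftime("%Y-%m-%d %H:%M:%S %Z"), offset)
# prints: 2015-10-12 00:07:59 UTC 23521971
```

The debug output reports HISTORY TIME as Mon Oct 12 02:07:59 2015, which matches this value if the server's clock is two hours ahead of UTC.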

Hope that helps.

mkhallaf commented 9 years ago

Just re-stating:

darold commented 9 years ago

Please run squid-analyzer in debug mode (-d) again and post the output here, but please do not remove any .dat files beforehand.

mkhallaf commented 9 years ago

/.../squid-analyzer -d

HISTORY TIME: Mon Oct 12 02:07:59 2015 - HISTORY OFFSET: 23521971
Starting to parse logfile access.log.
Reading file access.log from offset 23521971 to end.
Appending data into /..TRIMMED../2015/10/12
Clearing statistics storage hashes, for 2015-10-12 02:00:00.
Appending data into /..TRIMMED../2015/10/12
Clearing statistics storage hashes, for 2015-10-12 03:00:00.
Appending data into /..TRIMMED../2015/10/12
Clearing statistics storage hashes, for 2015-10-12 04:00:00.
Appending data into /..TRIMMED../2015/10/12
Clearing statistics storage hashes, for 2015-10-12 05:00:00.
Appending data into /..TRIMMED../2015/10/12
Clearing statistics storage hashes, for 2015-10-12 06:00:00.
Appending data into /..TRIMMED../2015/10/12
Clearing statistics storage hashes, for 2015-10-12 07:00:00.
Appending data into /..TRIMMED../2015/10/12
Clearing statistics storage hashes, for 2015-10-12 08:00:00.
Appending data into /..TRIMMED../2015/10/12
Clearing statistics storage hashes, for 2015-10-12 09:00:00.
Appending data into /..TRIMMED../2015/10/12
Clearing statistics storage hashes, for 2015-10-12 10:00:00.
Appending data into /..TRIMMED../2015/10/12
Clearing statistics storage hashes, for 2015-10-12 11:00:00.
Appending data into /..TRIMMED../2015/10/12
END TIME : Mon Oct 12 12:20:58 2015
Read 539894 lines, matched 392128 and found 392128 new lines
Reordering daily data files now...
Saving data into /..TRIMMED../2015/10/01
Saving data into /..TRIMMED../2015/10/02
Saving data into /..TRIMMED../2015/10/03
Saving data into /..TRIMMED../2015/10/04
Saving data into /..TRIMMED../2015/10/05
Saving data into /..TRIMMED../2015/10/06
Saving data into /..TRIMMED../2015/10/07
Saving data into /..TRIMMED../2015/10/08
Saving data into /..TRIMMED../2015/10/09
Saving data into /..TRIMMED../2015/10/10
Saving data into /..TRIMMED../2015/10/11
Saving data into /..TRIMMED../2015/10/12
Generating weekly data files now...
Compute and dump weekly statistics for week 42 on 2015
Saving data into /..TRIMMED../2015/week42
Generating monthly data files now...
Compute and dump month statistics for 2015/10
Saving data into /..TRIMMED../2015/10
Generating yearly data files now...
Compute and dump year statistics for 2015
Saving data into /..TRIMMED../2015
DEBUG: the log statistics gathering took:31 wallclock secs (26.87 usr + 0.71 sys = 27.58 CPU)
Building HTML output into /..TRIMMED..
Generating statistics for day 2015-10-12
User statistics in /..TRIMMED../2015/10/12...
Mime type statistics in /..TRIMMED../2015/10/12...
Network statistics in /..TRIMMED../2015/10/12...
Top URL statistics in /..TRIMMED../2015/10/12...
Top domain statistics in /..TRIMMED../2015/10/12...
Cache statistics in /..TRIMMED../2015/10/12...
Generating statistics for month 2015-10
User statistics in /..TRIMMED../2015/10...
Mime type statistics in /..TRIMMED../2015/10...
Network statistics in /..TRIMMED../2015/10...
Top URL statistics in /..TRIMMED../2015/10...
Top domain statistics in /..TRIMMED../2015/10...
Cache statistics in /..TRIMMED../2015/10...
Generating statistics for week 42 on year 2015
User statistics in /..TRIMMED../2015/week42...
Mime type statistics in /..TRIMMED../2015/week42...
Network statistics in /..TRIMMED../2015/week42...
Top URL statistics in /..TRIMMED../2015/week42...
Top domain statistics in /..TRIMMED../2015/week42...
Cache statistics in /..TRIMMED../2015/week42...
Generating statistics for year 2015
User statistics in /..TRIMMED../2015...
Mime type statistics in /..TRIMMED../2015...
Network statistics in /..TRIMMED../2015...
Top URL statistics in /..TRIMMED../2015...
Top domain statistics in /..TRIMMED../2015...
Cache statistics in /..TRIMMED../2015...
DEBUG: generating HTML output took:43 wallclock secs (37.27 usr + 1.68 sys = 38.95 CPU)
DEBUG: total execution time:74 wallclock secs (64.14 usr + 2.39 sys = 66.53 CPU)

darold commented 9 years ago

OK, it works as expected, so your statistics should have been updated. Now, if you run "echo > access.log", the access.log file will be emptied. Wait some time until squid has logged some new entries, then run squid-analyzer in debug mode (-d) again and post the output here.

Normally squid-analyzer should detect that the file has changed and start from the beginning of the file. If not, there is a bug.
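As an illustration of the kind of check described here, one plausible approach is to compare the saved history offset with the current size of the log. This is a sketch under that assumption, not SquidAnalyzer's actual code (the tool itself is written in Perl); `choose_start_offset` is a hypothetical name:

```python
import os


def choose_start_offset(log_path, saved_offset):
    """Return the byte offset to resume parsing from (hypothetical helper).

    Assumption: a log file that is now smaller than the saved offset was
    truncated or replaced, so parsing should restart from byte 0.
    """
    size = os.path.getsize(log_path)
    if size < saved_offset:
        return 0
    return saved_offset
```

A size comparison alone would miss a log that was emptied and then grew past the old offset before the next run; comparing the timestamp of the first log line against the saved history time is one way such a case could be caught.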

mkhallaf commented 9 years ago

Reproduced:

echo > /.../access.log

/.../squid-analyzer

DEBUG: the log statistics gathering took: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
DEBUG: generating HTML output took: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
DEBUG: total execution time: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)

/.../squid-analyzer -d

HISTORY TIME: Mon Oct 12 12:40:12 2015 - HISTORY OFFSET: 109923041
Starting to parse logfile /.../access.log.
Reading file /.../access.log from offset 109923041 to end.
No new log registered...
DEBUG: the log statistics gathering took: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
Skipping HTML build.
DEBUG: generating HTML output took: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
DEBUG: total execution time: 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
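The output above is consistent with a reader that seeks to the saved offset even though the file is now much shorter. A small Python sketch of that symptom, using a temporary stand-in file rather than a real Squid log; `read_new_lines` is a hypothetical helper and does not reflect SquidAnalyzer's internals:

```python
import os
import tempfile


def read_new_lines(log_path, offset):
    """Read every line from `offset` to end of file (hypothetical helper)."""
    with open(log_path, "rb") as fh:
        fh.seek(offset)  # seeking past end of file is allowed and raises no error
        return fh.readlines()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "access.log")
    with open(path, "wb") as fh:
        fh.write(b"\n")  # `echo > access.log` leaves a single newline byte
    lines = read_new_lines(path, 109923041)  # offset taken from the output above
    print(len(lines))
    # prints: 0
```

Because the seek lands past the end of the file, the read returns nothing and no error is raised, which would explain a silent "No new log registered" result.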

darold commented 9 years ago

This was a bug; it should be fixed by the latest commit, 90aed4a. I'm publishing a new 6.3-1 release now, as this is a major bug.