darold / squidanalyzer

Squid Analyzer parses Squid proxy access logs and reports general statistics about hits, bytes, users, networks, top URLs, and top second-level domains. Statistical reports are oriented toward user and bandwidth control.
http://squidanalyzer.darold.net/

Parse old log files #84

Closed: niccarp closed this issue 9 years ago

niccarp commented 9 years ago

Hi Darold.

I don't know if this is possible, but I want to parse old logs and insert them into the HTML reports, because some days do not have correct information. When I try, I get an error.

I remember doing this some time ago; maybe something has changed in newer versions.

I already tried the rebuild option, but with no luck.

I also found the file last_parsed.tmp in the /tmp directory. Is it correct for this file to exist after a clean run of the /usr/local/bin/squid-analyzer binary?

Is it possible to parse old log files?

root@proxy:/var/log/squid3# /usr/local/bin/squid-analyzer -d hola.log
HISTORY TIME: Fri May 29 12:23:00 2015 - HISTORY OFFSET: 13441260
Starting to parse logfile hola.
DEBUG: this file will not been parsed: hola, line after offset is older than expected: Tue May 26 12:07:00 2015

Thanks for all your work!
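
The debug output above suggests that the file was skipped because the first line after the saved offset is dated earlier than the history time. The following is only an illustration of that comparison using the two timestamps from the output; the function name is hypothetical and this is not SquidAnalyzer's actual code.

```python
from datetime import datetime

# Timestamp layout as printed in the debug output, e.g. "Fri May 29 12:23:00 2015".
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def is_older_than_history(line_time: str, history_time: str) -> bool:
    """Return True when a log line is dated before the saved history time."""
    line = datetime.strptime(line_time, STAMP_FORMAT)
    history = datetime.strptime(history_time, STAMP_FORMAT)
    return line < history


# Values copied from the debug output in the comment above.
history_time = "Fri May 29 12:23:00 2015"
line_time = "Tue May 26 12:07:00 2015"
print(is_older_than_history(line_time, history_time))
```

A log whose lines all predate the history time would be rejected by a check of this kind, which matches the message shown for hola.log.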

darold commented 9 years ago

Hi,

First, move the history file /var/www/squidanalyzer/SquidAnalyzer.current to another location, then give all your old log files to squid-analyzer in chronological order. At the end, replace the history file with the one you saved. Once this is done, perform a rebuild to fix the obsolete HTML pages.

Files in /tmp are temporary files; they are normally removed automatically at exit.

Regards,
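
The steps darold describes above can be sketched as a small planning script. It is a minimal sketch under several assumptions: the old logs are uncompressed and in Squid's native format (each line starts with a Unix timestamp), the backup path /root/SquidAnalyzer.current.saved is a hypothetical location, and the helper names are made up for illustration. It only prints the commands so they can be reviewed before anything is run.

```python
import shlex
import sys


def first_timestamp(log_path):
    """Return the epoch time on the first non-empty line of a log file.

    Assumes Squid's native log format, where each line begins with a
    Unix timestamp such as 1432652820.123.
    """
    with open(log_path, "r", errors="replace") as handle:
        for line in handle:
            fields = line.split()
            if fields:
                return float(fields[0])
    raise ValueError(f"no log lines found in {log_path}")


def build_reparse_plan(
    old_logs,
    history_file="/var/www/squidanalyzer/SquidAnalyzer.current",
    backup_file="/root/SquidAnalyzer.current.saved",  # hypothetical location
    binary="/usr/local/bin/squid-analyzer",
):
    """Return the commands for re-parsing old logs, oldest log first."""
    ordered = sorted(old_logs, key=first_timestamp)
    plan = [["mv", history_file, backup_file]]          # 1. set the history file aside
    plan += [[binary, str(log)] for log in ordered]     # 2. parse old logs in order
    plan.append(["mv", backup_file, history_file])      # 3. restore the saved history
    plan.append([binary, "--rebuild"])                  # 4. regenerate the HTML pages
    return plan


if __name__ == "__main__":
    # Usage: python reparse_plan.py old1.log old2.log ...
    for command in build_reparse_plan(sys.argv[1:]):
        print(shlex.join(command))
```

Printing the plan rather than executing it is a deliberate choice here: the history file controls what squid-analyzer considers already parsed, so it seems safer to check the order of the commands by hand first.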

niccarp commented 9 years ago

Great, Darold.

Your steps work perfectly. Now I have the correct output.

Can I ask one more question? I already added an exclusion entry in /etc/squidanalyzer/excluded, and squidanalyzer.conf has the directive Exclude /etc/squidanalyzer/excluded

The line in the excluded file is CLIENT 192.168.0..* and I also tried #CLIENT 192.168.0.\d+

The HTML report still shows the IP addresses, for example 192.168.0.213

I already ran a rebuild with this configuration, but it does not seem to take the entry into account. Could you help me with this?

Thanks a lot!

darold commented 9 years ago

The include/exclude filters are only applied while the Squid log is being parsed. Once a line is in the data file it is not filtered again, which I think is why you still see those entries. I think I can make a patch to allow filtering the data files when --rebuild is enabled. I will let you know when it is available.
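
The behaviour darold describes can be illustrated with a toy model. This is a sketch, not SquidAnalyzer's implementation: the function name and the sample addresses other than 192.168.0.213 are hypothetical, and it assumes the pattern is tested against the client address as an unanchored regular expression.

```python
import re


def parse_with_exclusions(client_ips, exclude_patterns):
    """Keep only the clients that match none of the exclusion patterns."""
    compiled = [re.compile(pattern) for pattern in exclude_patterns]
    return [
        ip for ip in client_ips
        if not any(regex.search(ip) for regex in compiled)
    ]


# Entries stored by an earlier run, before the exclusion existed.
stored = ["192.168.0.213", "10.0.0.5"]

# A later run with the exclusion pattern from the thread in place.
new_lines = ["192.168.0.213", "192.168.0.17", "10.0.0.8"]
stored += parse_with_exclusions(new_lines, [r"192.168.0.\d+"])

print(stored)
```

In this model 192.168.0.213 is still present at the end, because it was stored before the filter existed and the filter only looks at newly parsed lines. That is the situation the proposed --rebuild patch is meant to address.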

niccarp commented 9 years ago

Perfect, Darold.

Meanwhile, if I am right, I can move away all the /var/www/squidanalyzer/YYYY/users/XXX.XXX.XXX.XXX and /var/www/squidanalyzer/YYYY/users/MM/XXX.XXX.XXX.XXX directories, and when new data is parsed those IPs will be excluded.

Anyway, I will wait for the patch and keep the issue open until you have news.

Once more, thank you very much.

darold commented 9 years ago

The latest commit, eab4c5a, applies the exclusion/inclusion definitions to old data when rebuild is used. The data files themselves still have to be removed manually, but the excluded entries will no longer appear in the HTML reports. Please use the latest development code and let us know if there are any issues.

Best regards,

niccarp commented 9 years ago

Hi Darold.

I patched with the latest release and it works perfectly.

In the end I used an Include entry with the expression USER .*@domain.local in order to get only the data for users logged in to the domain.

Thanks! I am closing the issue.
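
The include expression mentioned above can be tried out in isolation. This is a minimal sketch that assumes the pattern is applied to the user field as an unanchored regular expression; the user names and the helper name are hypothetical.

```python
import re

# Pattern copied from the comment above; note that the dot in "domain.local"
# is unescaped and therefore matches any single character.
INCLUDE_USER = re.compile(r".*@domain.local")


def is_included(user: str) -> bool:
    """Return True when the user field matches the include pattern."""
    return INCLUDE_USER.search(user) is not None


for user in ["alice@domain.local", "192.168.0.213", "bob@domainXlocal"]:
    print(user, is_included(user))
```

Entries identified only by an IP address do not match, which is how the include rule limits the report to domain users. Writing the pattern as `.*@domain\.local` would avoid the loose match on the last sample.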