eldy / AWStats

AWStats Log Analyzer project (official sources)
https://www.awstats.org

Bug in latest 7.6 with NCSA log format: URL strings get truncated after the first blank! #58

Open BMEIA opened 7 years ago

BMEIA commented 7 years ago

There is a problem with NCSA LogFormat 4 in combination with URLs that contain blanks. The URL string gets truncated after the first blank, although it is enclosed in quotes!

Example:

LogFormat=4 (#LogFormat = "%host %other %logname %time1 %methodurl %code %bytesd")

172.30.22.5 - tom.smith [03/Jan/2016:10:39:06 +0100] "GET /Download/OmniaBehandlung elektronischer Geschäftsstücke__Ergänzung 2016.pdf HTTP/1.1" 200 96063

It is tracked, truncated, as a "Page-URL": /Download/OmniaBehandlung

instead of correctly as a file under "Downloads": /Download/OmniaBehandlung elektronischer Geschäftsstücke__Ergänzung 2016.pdf

So all statistics for Page-URL and Downloads are counted wrong!

BMEIA commented 7 years ago

In the meantime I found the reason and a solution:

awstats.pl Line 9025: $PerlParsingFormat = "([^ ]+) [^ ]+ (.+) \\[([^ ]+) [^ ]+\\] \\\"([^ ]+) ([^ ]+)(?: [^\\\"]+|)\\\" ([\\d|-]+) ([\\d|-]+)";

has to be changed to: $PerlParsingFormat = "([^ ]+) [^ ]+ (.+) \\[([^ ]+) [^ ]+\\] \\\"([^ ]+) (.+) [^\\\"]+\\\" ([\\d|-]+) ([\\d|-]+)";

It would be good to have this fixed in the next version!
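
The effect of the change can be checked outside AWStats. The following is only a sketch under stated assumptions: it ports the two patterns to Python's `re` module (AWStats itself is Perl), with Perl's double-quoted-string escaping already resolved, and `parse_url` is a hypothetical helper, not an AWStats function.

```python
import re

# Patterns from the comment above, after Perl's double-quoted-string
# unescaping ("\\[" becomes "\[", "\\\"" becomes "\""). Porting them to
# Python's re module is an assumption made for illustration only.
ORIGINAL = re.compile(
    r'([^ ]+) [^ ]+ (.+) \[([^ ]+) [^ ]+\] '
    r'\"([^ ]+) ([^ ]+)(?: [^\"]+|)\" ([\d|-]+) ([\d|-]+)'
)
PROPOSED = re.compile(
    r'([^ ]+) [^ ]+ (.+) \[([^ ]+) [^ ]+\] '
    r'\"([^ ]+) (.+) [^\"]+\" ([\d|-]+) ([\d|-]+)'
)

# The log line from the report, split only for readability.
SAMPLE = (
    '172.30.22.5 - tom.smith [03/Jan/2016:10:39:06 +0100] '
    '"GET /Download/OmniaBehandlung elektronischer '
    'Geschäftsstücke__Ergänzung 2016.pdf HTTP/1.1" 200 96063'
)


def parse_url(pattern, line):
    """Hypothetical helper: return the URL capture (group 5), or None."""
    match = pattern.match(line)
    return match.group(5) if match else None


if __name__ == "__main__":
    print(parse_url(ORIGINAL, SAMPLE))  # /Download/OmniaBehandlung
    print(parse_url(PROPOSED, SAMPLE))  # full path including the blanks
```

In this port, the original pattern stops the URL at the first blank, while the proposed one captures the whole path. One thing to watch: the proposed pattern makes the token after the URL mandatory, so a request line without a protocol part, such as `"GET /index.html"`, no longer matches here, whereas the original pattern accepted it.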

eldy commented 7 years ago

@BMEIA It seems that with GitHub's rendering of code, some \ are lost. To be sure, can you send me your awstats.pl file, after the change, to eldy@users.sourceforge.net?

BMEIA commented 7 years ago

Sorry, that's true. I corrected it in my post above and, in addition, sent you the awstats.pl file by mail!