kassner / log-parser

PHP Web Server Log Parser Library
Apache License 2.0

Blank referrers and agents #5

Closed by robhoare 10 years ago

robhoare commented 10 years ago

The parser dies if either the referrer or the user agent in a log line is blank (that is, if the field consists only of a pair of double quotes). This is rare (I had about six cases in a 3 million line test log file), but it does halt further processing.

As a workaround (which is probably slow), I replace the blank referrer or agent before processing the line:

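// Blank agent: a closing quote, a space, then an empty pair of quotes.
$line = str_replace('" ""', '" "-"', $line);
// Blank referrer: an empty pair of quotes, a space, then an opening quote.
$line = str_replace('"" "', '"-" "', $line);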
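
The same idea can be wrapped in a small helper. This is only a sketch: `fillBlankQuotedFields` is a hypothetical name, not part of kassner/log-parser, and it swaps the two `str_replace` calls for a single regular expression that turns any standalone `""` field into `"-"`. The sample line is made up for illustration, and the sketch assumes quoted fields do not contain escaped quotes.

```php
<?php
// Hypothetical pre-processing helper (not part of the library).
function fillBlankQuotedFields(string $line): string
{
    // Match "" only when it stands alone as a field: preceded by a space
    // and followed by a space or the end of the line.
    // Fall back to the untouched line if preg_replace reports an error.
    return preg_replace('/(?<= )""(?= |$)/', '"-"', $line) ?? $line;
}

// Made-up sample line with both the referrer and the agent blank.
$line = '192.0.2.10 - - [27/Oct/2013:06:27:33 +0000] "GET / HTTP/1.0" 200 126 "" ""';
echo fillBlankQuotedFields($line), PHP_EOL;
// prints: 192.0.2.10 - - [27/Oct/2013:06:27:33 +0000] "GET / HTTP/1.0" 200 126 "-" "-"
```

Lines that already carry `"-"` or a real value pass through unchanged, so the helper can be applied to every line before it reaches the parser.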

Also, as you probably know (from the outstanding IPv6 issue), the parser fails on any line containing an IPv6 address. This includes even the IPv6 localhost address, in lines like:

www.example.com:80 ::1 - - [27/Oct/2013:06:27:33 +0000] "OPTIONS * HTTP/1.0" 200 126 "-" "Apache/2.2.22 (Ubuntu) (internal dummy connection)"

A workaround for this is to search each line for "::1 - -" and skip the line if it is present.
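
A minimal sketch of that skip, assuming the log lines are already in an array: `hasIpv6Loopback` is a hypothetical helper name, and the second sample line is made up for contrast. Note that the substring also matches other addresses ending in `::1`, which would fail in the parser anyway.

```php
<?php
// Hypothetical helper: true when the line contains the "::1 - -" marker.
function hasIpv6Loopback(string $line): bool
{
    return strpos($line, '::1 - -') !== false;
}

$lines = [
    // The internal dummy connection line quoted above.
    'www.example.com:80 ::1 - - [27/Oct/2013:06:27:33 +0000] "OPTIONS * HTTP/1.0" 200 126 "-" "Apache/2.2.22 (Ubuntu) (internal dummy connection)"',
    // Made-up IPv4 line that should be kept.
    'www.example.com:80 192.0.2.10 - - [27/Oct/2013:06:27:34 +0000] "GET / HTTP/1.0" 200 126 "-" "-"',
];

// Keep only the lines that do not contain the loopback marker.
$kept = array_values(array_filter($lines, function ($line) {
    return !hasIpv6Loopback($line);
}));

echo count($kept), PHP_EOL; // prints 1
```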