kassner / log-parser

PHP Web Server Log Parser Library
Apache License 2.0

Parse invalid requests strings #23

Closed Rud5G closed 6 years ago

Rud5G commented 8 years ago

169.229.3.91 - - [05/Jun/2016:15:26:51 +0200] "\x99\xf3\x0fF\xd9\xdde\xba\x97" 501 308 "-" "-"

kassner commented 8 years ago

Hi @Rud5G

Thank you for your contribution!

I'm keen to merge this, but I have a question regarding spaces.

curl -v "http://www.kassner.com.br/index.php?x=a b"

Results in:

xx.xx.xx.xx - - [14/Jun/2016:15:58:13 -0400] "GET /index.php?x=a b HTTP/1.1" 404 151 "-" "curl/7.43.0"

Using spaces in the request is "valid", as the nginx access log shows above, but we usually use %20 instead.

Any idea how to handle this? With your PR, the test below breaks because of the ^\s match.

Test:

$entry = $parser->parse('169.229.3.91 - - [05/Jun/2016:15:26:51 +0200] "GET /index.php?x=a b HTTP/1.1" 404 308 "-" "curl/7.43.0"');
$this->assertEquals('169.229.3.91', $entry->host);
$this->assertEquals('-', $entry->logname);
$this->assertEquals('-', $entry->user);
$this->assertEquals('05/Jun/2016:15:26:51 +0200', $entry->time);
$this->assertEquals('/index.php?x=a b', $entry->request);
$this->assertEquals('404', $entry->status);
$this->assertEquals('308', $entry->sentBytes);
$this->assertEquals('-', $entry->HeaderReferer);
$this->assertEquals('curl/7.43.0', $entry->HeaderUserAgent);
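
One possible way to reconcile the two cases in this thread (a URL containing a literal space, and a request string that is not HTTP at all) is to capture the quoted request as a whole and split it from both ends. The sketch below is only an illustration: `splitRequestLine` is a hypothetical helper, not part of the library, and its pattern is an assumption rather than the one used in the PR.

```php
<?php
// Hypothetical helper, not part of kassner/log-parser.
// Splits an already-extracted request string into method, URL and protocol.
function splitRequestLine(string $raw): array
{
    // Anchor the method at the start and the protocol at the end, so that
    // everything in between (spaces included) is treated as the URL.
    if (preg_match('/^([A-Z]+) (.+) (HTTP\/\d(?:\.\d)?)$/', $raw, $m)) {
        return ['method' => $m[1], 'url' => $m[2], 'protocol' => $m[3]];
    }

    // Not a well-formed request line (for example binary garbage):
    // keep the raw string instead of failing the whole log entry.
    return ['method' => null, 'url' => $raw, 'protocol' => null];
}

// Request line with a literal space in the query string.
$valid = splitRequestLine('GET /index.php?x=a b HTTP/1.1');

// Invalid request as it appears in the access log (escapes kept as literal text).
$invalid = splitRequestLine('\x99\xf3\x0fF\xd9\xdde\xba\x97');
```

With this approach the space in `x=a b` stays inside the URL, and the invalid request is returned verbatim with no method or protocol. Whether that fallback is the right behaviour for the library is a design question for the PR.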