Closed. astorm closed this issue 4 years ago.
Hi. This case is similar to #49, which involves badly formed HTTP requests. Given that they're not technically valid, I don't know how much value you get from parsing them, but the $parser->addPattern('%r', '(?P<request>.+)'); trick mentioned there is a good workaround if the main parsing fails.
I'd keep parsing logs with the format you have, and have a second instance of LogParser configured with the addPattern override that parses the line again to extract things like the IP address and User-Agent.
Something like:
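// Namespace assumed from the kassner/log-parser package; adjust if yours differs.
use Kassner\LogParser\FormatException;
use Kassner\LogParser\LogParser;

// Strict parser: the regular format, used for well-formed lines.
$parser = new LogParser();
$parser->setFormat('%h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"');

// Lax parser: same format, but %r accepts anything between the quotes.
$laxParser = new LogParser();
$laxParser->setFormat('%h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"');
$laxParser->addPattern('%r', '(?P<request>.+)');

foreach ($lines as $line) {
    try {
        try {
            $entry = $parser->parse($line);
        } catch (FormatException $e) {
            // Malformed request: retry with the lax parser.
            $entry = $laxParser->parse($line);
        }
    } catch (FormatException $e) {
        // Neither parser matched; skip this line.
        continue;
    }

    // process $entry
}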
> This case is similar to #49, which involves badly formed HTTP requests.
> I'd keep parsing logs with the format you have and have a second instance ...
While it's not what I wanted to hear -- that's a fair philosophy. Closing out.
Hello there -- first off, thank you for building this and saving us all the trouble of building our own regular expressions to parse Apache's log files.
When I tried using this package on my actual real-world Apache logs, it mostly worked. However, there were a number of lines that it failed to parse, and it threw an exception in my program. Here's one example
My log format looks like this
Here's one line that failed to parse
and here are a few others
Is there a way to configure this library to be less strict when trying to parse these log lines?
If not, do you have any time/interest in enhancing the functionality of this library so it can handle cases like these?