To give more detail: the logs I want to parse look like this, and I'm having a heck of a time figuring out how to write the format string...
Mar 8 19:36:37 ip-172-31-16-77 haproxy[2168]: 666.666.666.666:666 [08/Mar/2019:19:36:37.629] http-in~ wordpress/webv2 205/0/1/1/207 200 6826 - - ---- 8/8/1/1/0 0/0 "GET /wp-includes/js/underscore.min.js?ver=1.8.3 HTTP/1.1"
Hi @catbadger
You can create a custom format like Joshua mentions in https://github.com/kassner/log-parser/issues/39
You'll have to create a pattern for each unwanted column and give it a named capture group so that the line still matches; then you simply don't use those fields later on.
So, if you do something like this:
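require __DIR__ . '/vendor/autoload.php'; // assuming the library was installed with Composer
$parser = new \Kassner\LogParser\LogParser();
// Throwaway patterns: each unwanted column gets its own placeholder and a named group.
$parser->addPattern('%GBG1', '(?P<gb1>[a-zA-Z]+\s+\d+ \d+\:\d+\:\d+)'); // syslog timestamp
$parser->addPattern('%GBG2', '(?P<gb2>[a-zA-Z]+\[\d+\]\:)'); // process name and pid
// %h, %a and %p produce the host, remoteIp and port fields shown in the output.
$parser->setFormat('%GBG1 %h %GBG2 %a:%p');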
var_dump($parser->parse('Mar  8 19:36:37 ip-172-31-16-77 haproxy[2168]: 1.2.3.4:5678'));
You will get an object like this:
object(stdClass)#3 (5) {
  ["gb1"]=>
  string(15) "Mar  8 19:36:37"
  ["host"]=>
  string(15) "ip-172-31-16-77"
  ["gb2"]=>
  string(14) "haproxy[2168]:"
  ["remoteIp"]=>
  string(7) "1.2.3.4"
  ["port"]=>
  string(4) "5678"
}
Then you can just ignore gb1 and gb2.
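For the full HAProxy line from the question, the same idea can be sketched with plain preg_match, independent of the library: the columns you want get a named group, and the garbage columns are matched but not captured. This is only a rough sketch, not the library's own format handling. The field names (time, status, bytes, request) are my own choices, the client address is replaced with a valid placeholder, and the regex assumes every line has the same column layout as the sample.

```php
<?php
// Sample line from the question, with the client address swapped for a placeholder.
$line = 'Mar 8 19:36:37 ip-172-31-16-77 haproxy[2168]: 1.2.3.4:5678 '
      . '[08/Mar/2019:19:36:37.629] http-in~ wordpress/webv2 205/0/1/1/207 200 6826 '
      . '- - ---- 8/8/1/1/0 0/0 "GET /wp-includes/js/underscore.min.js?ver=1.8.3 HTTP/1.1"';

// Named groups for the wanted columns; everything else is matched without a name.
$regex = '#^[A-Za-z]+\s+\d+ \d+:\d+:\d+ '       // syslog timestamp (skipped)
       . '(?P<host>\S+) '                         // syslog host
       . '\S+\[\d+\]: '                           // process name and pid (skipped)
       . '(?P<remoteIp>[\d.]+):(?P<port>\d+) '    // client address
       . '\[(?P<time>[^\]]+)\] '                  // HAProxy accept time
       . '\S+ \S+ \S+ '                           // frontend, backend/server, timers (skipped)
       . '(?P<status>\d{3}) (?P<bytes>\d+) '      // HTTP status and byte count
       . '.*? "(?P<request>[^"]*)"$#';            // remaining counters skipped, then request line

if (preg_match($regex, $line, $m) !== 1) {
    throw new RuntimeException('Line did not match the expected layout');
}

// preg_match returns numeric and named keys; keep only the named ones.
$entry = array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);

print_r($entry);
```

The resulting array should only contain host, remoteIp, port, time, status, bytes and request, so there is nothing left to ignore afterwards. If you stay with the library instead, the same skipped columns would each need their own addPattern call as shown above.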
A log file contains some garbage columns that I don't need to work with. How do I skip the bad columns?