kassner / log-parser

PHP Web Server Log Parser Library
Apache License 2.0

FormatException when line contains domain\\user_name #57

Open jiriermis opened 1 year ago

jiriermis commented 1 year ago

Hello,

I am getting Kassner\LogParser\FormatException when a line contains a domain user name.

::1 - DOMAIN\\user_name [03/Feb/2023:18:50:42 +0100] "GET /app/index.php HTTP/1.1" 200 229931

Do you have any idea how to fix it please?

Thank you

jiriermis commented 1 year ago

It seems that it will work if you update the pattern for %u as below:

'%u' => '(?P<user>(?:-|[\\\\\_\w\-\.]+))',
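To check the proposed character class in isolation, here is a minimal sketch that uses plain preg_match rather than the library. The "default-style" class is an assumption about what %u looked like before the change (the proposal above only adds a backslash and an underscore to it), and the variable names are illustrative.

```php
<?php
// Remote-user field as logged: DOMAIN, two backslashes, user_name.
$user = 'DOMAIN\\\\user_name';

// Assumed default-style class: word characters, hyphen, dot (no backslash).
$defaultPattern = '/^(?P<user>(?:-|[\w\-\.]+))$/';

// Class proposed in this thread, with the backslash added.
$patchedPattern = '/^(?P<user>(?:-|[\\\\\_\w\-\.]+))$/';

var_dump(preg_match($defaultPattern, $user));      // int(0): backslash is rejected
var_dump(preg_match($patchedPattern, $user, $m));  // int(1)
var_dump($m['user'] === $user);                    // bool(true)
```

The anchors ^ and $ are added only so the field can be tested on its own; inside a full log-line regex the pattern would be used without them.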

kassner commented 1 year ago

I'm unsure how to trigger this in the wild with a standard Apache/Nginx installation for a long-term fix, but if you're looking for a quick and clean workaround, you can use $parser->addPattern('%u', '(?P<user>(?:-|[\\\\\_\w\-\.]+))'); (see https://github.com/kassner/log-parser/blob/master/src/LogParser.php#L56-L60). That way you don't need to override the file in any way.

jiriermis commented 1 year ago

I tried the following custom patterns:

$parser->addPattern('%u', '(?P<user>(?:-|[\\\\\_\w\-\.]+))'); then $parser->getPCRE() contains '(?P<user>(?:-|[\\\_\w\-\.]+))' // Backslashes are missing.

$parser->addPattern('%u', '(?P<user>(?:-|[\\\\\\\\\_\w\-\.]+))'); then $parser->getPCRE() contains '(?P<user>(?:-|[\\\\\_\w\-\.]+))' // Now the backslashes are correct.

$parser->addPattern('%u', '(?P<user>(?:-|[\W\w\-\.]+))'); then $parser->getPCRE() contains '(?P<user>(?:-|[\W\w\-\.]+))'

But none of these work. I always get a FormatException.
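A note on the backslash counts above: in a single-quoted PHP string, each pair of backslashes collapses into one, so the shorter result shown by getPCRE() is expected rather than a sign of lost characters. The sketch below only counts backslashes and tests the resulting character class with preg_match; it does not touch the library, and the variable names are illustrative.

```php
<?php
// Single-quoted PHP strings collapse each "\\" pair into one backslash;
// a backslash before any other character (such as "_") is kept as typed.
$fiveTyped = '\\\\\_';      // five backslashes typed in source
$nineTyped = '\\\\\\\\\_';  // nine backslashes typed in source

echo substr_count($fiveTyped, '\\'), "\n"; // prints 3
echo substr_count($nineTyped, '\\'), "\n"; // prints 5

// What the regex engine sees for the five-backslash variant: [\\\_\w\-\.]
// i.e. an escaped backslash, an escaped underscore, \w, hyphen, dot.
$class = '/^[' . $fiveTyped . '\w\-\.]+$/';
var_dump(preg_match($class, 'DOMAIN\\\\user_name')); // int(1)
```

If this reading is right, the five-backslash variant already yields a class that accepts a backslash, which suggests the FormatException comes from somewhere other than the escaping.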

If I change the pattern directly in your LogParser.php like this '%u' => '(?P<user>(?:-|[\\\\\_\w\-\.]+))' or '%u' => '(?P<user>(?:-|[\W\w\-\.]+))', then it works properly.

It seems that the variable $pcreFormat in LogParser.php is not updated after addPattern(...) is called. I ran var_dump($this->pcreFormat) just before the FormatException is thrown, and the PCRE doesn't contain the added pattern; it still has the default patterns. That's why it worked when I changed the pattern directly in your file.

I tried modifying the addPattern function as below, and then it worked:

    public function addPattern(string $placeholder, string $pattern): void
    {
        $this->patterns[$placeholder] = $pattern;
        $this->updateIpPatterns();
        $this->setFormat(self::DEFAULT_FORMAT);
    }

But this would not use a custom format. The $format passed to the constructor would have to be stored as a class property so that it can be reapplied.
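The idea of remembering the format so that addPattern can recompile it might look like the sketch below. SketchParser is a hypothetical stand-in written for illustration, with only two placeholders and an assumed default-style %u class; it is not the library's LogParser and not necessarily how the library resolved this.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for illustration only; not the library's LogParser.
final class SketchParser
{
    /** @var array<string, string> placeholder => regex fragment */
    private array $patterns = [
        '%h' => '(?P<host>\S+)',
        '%u' => '(?P<user>(?:-|[\w\-\.]+))',
    ];

    private string $format;      // remembered so it can be recompiled later
    private string $pcreFormat;  // compiled regex used by parse()

    public function __construct(string $format = '%h %u')
    {
        $this->setFormat($format);
    }

    public function setFormat(string $format): void
    {
        $this->format = $format;
        $this->pcreFormat = '#^' . strtr($format, $this->patterns) . '$#';
    }

    public function addPattern(string $placeholder, string $pattern): void
    {
        $this->patterns[$placeholder] = $pattern;
        // Recompile with the caller's own format rather than a default one.
        $this->setFormat($this->format);
    }

    public function parse(string $line): ?array
    {
        return preg_match($this->pcreFormat, $line, $m) === 1 ? $m : null;
    }
}

$parser = new SketchParser('%u %h');          // custom field order
$line = 'DOMAIN\\\\user_name ::1';

var_dump($parser->parse($line) === null);     // bool(true): default %u rejects "\"
$parser->addPattern('%u', '(?P<user>(?:-|[\\\\\_\w\-\.]+))');
$result = $parser->parse($line);
var_dump($result['user'], $result['host']);
```

Because addPattern calls setFormat with the stored format, the custom field order '%u %h' survives the pattern change.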

Thank you

SAH62 commented 1 year ago

My issue #59 appears to be a duplicate of this one.

kassner commented 1 year ago

Hi @jiriermis.

I've just merged #60 into master, so updating the pattern should be a bit less painful now. Let me know if that helped you in any way. Version 2.1.1 should include the updated code.

Thank you.

jiriermis commented 1 year ago

Hi @kassner

The parser doesn't work for the common Apache log format if the line contains backslashes.

e.g.: ::1 - DOMAIN\\user_name [03/Feb/2023:18:50:42 +0100] "GET /app/index.php HTTP/1.1" 200 229931

Two backslashes are normal in the Apache log for the remote user when NTLM authentication is running on the web server.

I have to add the pattern like below:

$parser->addPattern('%u', '(?P<user>(?:-|[\W\w\-\.]+))'); This didn't work with the previous version.

It's just a bit of a shame that it can't do it by default.

Anyway, thank you for your concern.