Open rsommer opened 1 year ago
Logstash information: Logstash 8.6.1, installed as a Debian package from the official Elastic repository
JVM:
$ java --version
openjdk 11.0.18 2023-01-17
OpenJDK Runtime Environment (build 11.0.18+10-post-Debian-1deb11u1)
OpenJDK 64-Bit Server VM (build 11.0.18+10-post-Debian-1deb11u1, mixed mode, sharing)
OS version:
$ uname -a
Linux logstash 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21) x86_64 GNU/Linux
Description of the problem including expected versus actual behavior: The default grok pattern for HTTPDUSER (derived from USER) does not match "" (an empty pair of double quotes), which is valid apache2 log output if the given remote user is empty (https://github.com/apache/httpd/blob/5c55d4c0600e7734030fa4d549913b4e94b2b0f2/modules/loggers/mod_log_config.c#L382).
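To illustrate why the match fails, here is a minimal Python sketch. It assumes that USER resolves to the USERNAME character class `[a-zA-Z0-9._-]+` from the stock grok patterns; the helper name `matches_user` is made up for this example and is not part of Logstash.

```python
import re

# Assumption: grok's USERNAME (the basis of USER and HTTPDUSER) is this character class.
USERNAME = r"[a-zA-Z0-9._-]+"


def matches_user(field: str) -> bool:
    """Return True if the whole field is accepted by the USERNAME character class."""
    return re.fullmatch(USERNAME, field) is not None


# "-" and a plain user name are accepted; the quoted empty user is not,
# because the double quote is not part of the character class.
for field in ["-", "frank", '""']:
    print(repr(field), matches_user(field))
```

The double quote is simply outside the allowed character set, so no amount of backtracking lets the user field match.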
Steps to reproduce:
Send a request with an empty user name:
curl -u :password --basic http://localhost:80/
This produces the following access log line:
10.0.2.100 - "" [31/Jan/2023:07:59:58 +0000] "GET / HTTP/1.1" 401 381
Using the following simple config:
input {
  stdin { }
}
filter {
  grok {
    match => { "message" => "%{COMMONAPACHELOG}" }
  }
}
output {
  stdout { codec => rubydebug }
}
leads to:
The stdin plugin is now waiting for input:
[2023-01-31T09:15:12,405][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
10.0.2.100 - "" [31/Jan/2023:07:59:58 +0000] "GET / HTTP/1.1" 401 381
{
      "@version" => "1",
    "@timestamp" => 2023-01-31T08:15:18.016253371Z,
       "message" => "10.0.2.100 - \"\" [31/Jan/2023:07:59:58 +0000] \"GET / HTTP/1.1\" 401 381",
         "event" => {
        "original" => "10.0.2.100 - \"\" [31/Jan/2023:07:59:58 +0000] \"GET / HTTP/1.1\" 401 381"
    },
          "host" => {
        "hostname" => "localhost"
    },
          "tags" => [
        [0] "_grokparsefailure"
    ]
}
Adjusting the HTTPDUSER pattern to
HTTPDUSER %{EMAILADDRESS}|%{USER}|""
allows parsing of this valid apache2 log line. Running with the patched httpd pattern file:
The stdin plugin is now waiting for input:
[2023-01-31T09:30:51,040][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
10.0.2.100 - "" [31/Jan/2023:07:59:58 +0000] "GET / HTTP/1.1" 401 381
{
    "@timestamp" => 2023-01-31T08:30:53.599445217Z,
        "source" => {
        "address" => "10.0.2.100"
    },
           "url" => {
        "original" => "/"
    },
      "@version" => "1",
          "http" => {
         "request" => {
            "method" => "GET"
        },
         "version" => "1.1",
        "response" => {
            "status_code" => 401,
                   "body" => {
                "bytes" => 381
            }
        }
    },
       "message" => "10.0.2.100 - \"\" [31/Jan/2023:07:59:58 +0000] \"GET / HTTP/1.1\" 401 381",
          "user" => {
        "name" => "\"\""
    },
          "host" => {
        "hostname" => "localhost"
    },
     "timestamp" => "31/Jan/2023:07:59:58 +0000",
         "event" => {
        "original" => "10.0.2.100 - \"\" [31/Jan/2023:07:59:58 +0000] \"GET / HTTP/1.1\" 401 381"
    }
}
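The effect of the extra alternative can be reproduced outside Logstash with a rough Python approximation. This is only a sketch: the `common_log` regex is a simplified stand-in for COMMONAPACHELOG, `EMAIL` is a crude substitute for EMAILADDRESS, and all names here are hypothetical rather than taken from the Logstash pattern files.

```python
import re

# Assumptions: simplified stand-ins for the grok sub-patterns.
USER = r"[a-zA-Z0-9._-]+"
EMAIL = r'[^@\s"]+@[A-Za-z0-9.-]+'  # crude substitute for EMAILADDRESS

ORIGINAL_HTTPDUSER = rf"(?:{EMAIL}|{USER})"
PATCHED_HTTPDUSER = rf'(?:{EMAIL}|{USER}|"")'  # adds the quoted empty user


def common_log(httpduser: str) -> re.Pattern:
    """Build a simplified common-log-format regex around the given user pattern."""
    return re.compile(
        r"(?P<client>\S+) (?P<ident>\S+) (?P<user>" + httpduser + r") "
        r'\[(?P<timestamp>[^\]]+)\] "(?P<request>[^"]*)" '
        r"(?P<status>\d{3}) (?P<bytes>\d+|-)"
    )


LINE = '10.0.2.100 - "" [31/Jan/2023:07:59:58 +0000] "GET / HTTP/1.1" 401 381'

original = common_log(ORIGINAL_HTTPDUSER).fullmatch(LINE)
patched = common_log(PATCHED_HTTPDUSER).fullmatch(LINE)

print("original matches:", original is not None)
print("patched user:", repr(patched.group("user")))
```

As in the rubydebug output above, the captured user name is the literal two-character string of double quotes rather than an empty string, so downstream processing may still want to normalize it.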