Open JoobyPM opened 9 months ago
Describe the bug
prometheus-nginxlog-exporter generates the wrong regex for the given format. Given this nginx log format:
log_format main '$http_authorization bbs=$body_bytes_sent bs=$bytes_sent rl=$request_length rt=$request_time urt=$upstream_response_time $remote_addr - $http_x_forwarded_for $remote_user [$time_local] "$request" $status "$http_referer" "$http_user_agent"';
The generated regex, explained on regex101 (https://regex101.com/r/pjD94u/1), does not match log lines written in this format:
^(?P<http_authorization>[^ ]*) bbs=(?P<body_bytes_sent>[^ ]*) bs=(?P<bytes_sent>[^ ]*) rl=(?P<request_length>[^ ]*) rt=(?P<request_time>[^ ]*) urt=(?P<upstream_response_time>[^ ]*) (?P<remote_addr>[^ ]*) - (?P<http_x_forwarded_for>[^ ]*) (?P<remote_user>[^ ]*) \[(?P<time_local>[^]]*)\] "(?P<request>[^"]*)" (?P<status>[^ ]*) "(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"
$http_authorization might have values matching Bearer\s\S+|Basic\s\S+|-, so it can contain a space, which [^ ]* cannot match.
To Reproduce
Log a request whose http_authorization value is Bearer ...
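The failure can be reproduced outside the exporter. The following is a minimal sketch, assuming Python's `re` module behaves like the exporter's regex engine for this pattern (it should for these constructs, but that is an assumption). Only the first three fields of the generated regex are used, and the token `abc.def.ghi` and the helper name `matches_generated` are hypothetical placeholders, not values from the real log.

```python
import re

# Prefix of the regex generated by the exporter, copied from the error message.
# Only the first three fields are kept to make the example short.
GENERATED = re.compile(
    r'^(?P<http_authorization>[^ ]*) bbs=(?P<body_bytes_sent>[^ ]*) bs=(?P<bytes_sent>[^ ]*)'
)


def matches_generated(line: str) -> bool:
    """Return True if the line matches the generated regex prefix."""
    return GENERATED.match(line) is not None


# Hypothetical, shortened log lines in the format from this issue.
bearer_line = "Bearer abc.def.ghi bbs=105 bs=318"
dash_line = "- bbs=105 bs=318"

print(matches_generated(bearer_line))  # False: [^ ]* stops at the space after "Bearer"
print(matches_generated(dash_line))    # True: "-" contains no space
```

The line with an empty header (`-`) parses, while the line with a `Bearer` value does not, because `[^ ]*` cannot cross the space between the scheme and the token.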
Expected behavior
For the given format, the correct regex would be:
^(?<http_authorization>\S+\s\S+|-) bbs=(?<body_bytes_sent>\d*|-) bs=(?<bytes_sent>\d*|-) rl=(?<request_length>\d*|-) rt=(?<request_time>[\d.]*|-) urt=(?<upstream_response_time>[\d.]*|-) (?<remote_addr>-|(?:\d{1,3}(?:\.\d{1,3}){3})|(?:[0-9a-fA-F:]{2,39})) - (?<http_x_forwarded_for>-|(?:\d{1,3}(?:\.\d{1,3}){3})|(?:[0-9a-fA-F:]{2,39})) (?<remote_user>[\S-]+) \[(?<time_local>[^\]]+)\] "(?<method>\S+) (?<url>.*?) (?<protocol>HTTP\/\d\.\d)" (?<status>\d{3}) "(?<http_referer>[^"]*|-)" "(?<http_user_agent>[^"]*)"$
For $http_authorization: (?<http_authorization>\S+\s\S+|-) instead of (?P<http_authorization>[^ ]*)
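As a quick check of the proposed group, here is a sketch using the first three fields of the expected regex. It assumes Python's `re` module, which requires the `(?P<name>...)` spelling for named groups, so the `(?<name>...)` groups from the regex above are rewritten accordingly. The token `abc.def.ghi` and the helper name `parse_prefix` are hypothetical.

```python
import re

# First three fields of the proposed regex, with named groups rewritten
# from (?<name>...) to (?P<name>...) for Python's re module.
PROPOSED = re.compile(
    r'^(?P<http_authorization>\S+\s\S+|-) bbs=(?P<body_bytes_sent>\d*|-) bs=(?P<bytes_sent>\d*|-)'
)


def parse_prefix(line: str):
    """Return the captured fields as a dict, or None if the line does not match."""
    m = PROPOSED.match(line)
    return m.groupdict() if m else None


# Hypothetical, shortened log lines in the format from this issue.
print(parse_prefix("Bearer abc.def.ghi bbs=105 bs=318"))
print(parse_prefix("- bbs=105 bs=318"))
```

Both the `Bearer` line and the `-` line match here. Note that `\S+\s\S+` accepts exactly two whitespace-separated tokens, so a header value with a single token or with more than one space would still fail to match.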
Log file:
Feb 14 06:26:03 server-frontend nginx_exporter[9955]: 2024-02-14T06:26:03.036-0300#011error#011prometheus-nginxlog-exporter/main.go:302#011error while parsing line 'Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c bbs=105 bs=318 rl=1139 rt=0.183 urt=0.184 111.111.111.111 - - - [14/Feb/2024:06:26:02 -0300] "GET /app-api/ping HTTP/1.1" 200 "-" "Mozilla/5.0 (Web0S; Linux/SmartTV) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.34 Safari/537.36 WebAppManager"' : text log parsing err: access log line 'Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c bbs=105 bs=318 rl=1139 rt=0.183 urt=0.184 111.111.111.111 - - - [14/Feb/2024:06:26:02 -0300] "GET /app-api/ping HTTP/1.1" 200 "-" "Mozilla/5.0 (Web0S; Linux/SmartTV) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.34 Safari/537.36 WebAppManager"' does not match given format ^(?P<http_authorization>[^ ]*) bbs=(?P<body_bytes_sent>[^ ]*) bs=(?P<bytes_sent>[^ ]*) rl=(?P<request_length>[^ ]*) rt=(?P<request_time>[^ ]*) urt=(?P<upstream_response_time>[^ ]*) (?P<remote_addr>[^ ]*) - (?P<http_x_forwarded_for>[^ ]*) (?P<remote_user>[^ ]*) \[(?P<time_local>[^]]*)\] "(?P<request>[^"]*)" (?P<status>[^ ]*) "(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"
Environment: