google / mtail

extract internal monitoring data from application logs for collection in a timeseries database
Apache License 2.0

"Capture group reference appears to be unused" with integer names #581

Open filippog opened 3 years ago

filippog commented 3 years ago

Hello, I have noticed that checker.go reports unused capture groups by integer as well as by the expected names. I can't quite understand why, since AFAICS all capture groups are named with strings. This is the program I'm testing:

counter nginx_http_requests by host, vhost, method, code
counter nginx_http_requests_cache by host, vhost, cache_status

/(?P<hostname>[-0-9A-Za-z._:]+) nginx_access: (?P<vhost>[-0-9A-Za-z._:]+) \S+ (?P<remote_addr>[0-9a-f\.:]+) - - \[[^\]]+\] "(?P<request_method>[A-Z]+) (?P<request_uri>\S+) (?P<http_version>HTTP\/[0-9\.]+)" (?P<status>\d{3}) ((?P<response_size>\d+)|-) "[^"]*" "[^"]*" ([-0-9A-Za-z._:]+) ((?P<ups_resp_seconds>\d+\.\d+)|-) (?P<request_seconds>\d+)\.(?P<request_milliseconds>\d+) (?P<cache_status>\S+)/ {

  nginx_http_requests[$hostname][$vhost][$request_method][$status]++
  nginx_http_requests_cache[$hostname][$vhost][$cache_status]++
}

And rc47 shows this output:

mtail[367499]: I0905 15:15:34.294884  367499 main.go:114] mtail version 3.0.0-rc47 git revision 5e0099f843e4e4f2b7189c21019de18eb49181bf go version go1.16.5 go arch amd64 go os linux
mtail[367499]: I0905 15:15:34.295490  367499 main.go:115] Commandline: ["/usr/bin/mtail" "--progs" "/etc/mtail" "--logtostderr" "--port" "3903" "--poll_interval" "0" "--logs" "/dev/fd/3"] 
mtail[367499]: I0905 15:15:34.295633  367499 main.go:145] no poll interval specified; defaulting to 250ms poll
mtail[367499]: I0905 15:15:34.296227  367499 store.go:182] Starting metric store expiry loop every 1h0m0s
mtail[367499]: I0905 15:15:34.296629  367499 checker.go:253] capture group reference `http_version' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296644  367499 checker.go:253] capture group reference `request_seconds' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296648  367499 checker.go:253] capture group reference `request_milliseconds' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296652  367499 checker.go:253] capture group reference `request_uri' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296655  367499 checker.go:253] capture group reference `ups_resp_seconds' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296658  367499 checker.go:253] capture group reference `remote_addr' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296661  367499 checker.go:253] capture group reference `remote_addr' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296664  367499 checker.go:253] capture group reference `request_uri' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296668  367499 checker.go:253] capture group reference `response_size' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296671  367499 checker.go:253] capture group reference `response_size' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296674  367499 checker.go:253] capture group reference `request_seconds' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296677  367499 checker.go:253] capture group reference `http_version' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296680  367499 checker.go:253] capture group reference `8' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296683  367499 checker.go:253] capture group reference `request_milliseconds' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296687  367499 checker.go:253] capture group reference `10' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296690  367499 checker.go:253] capture group reference `ups_resp_seconds' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296693  367499 checker.go:253] capture group reference `11' at nginx.mtail:4:1-399 appears to be unused
mtail[367499]: I0905 15:15:34.296764  367499 runtime.go:188] Loaded program nginx.mtail
mtail[367499]: I0905 15:15:34.296775  367499 runtime.go:84] unmarking nginx.mtail 
mtail[367499]: I0905 15:15:34.296780  367499 runtime.go:84] unmarking systemd.mtail_
mtail[367499]: I0905 15:15:34.296812  367499 logstream.go:61] Parsed url as /dev/fd/3
mtail[367499]: I0905 15:15:34.296833  367499 tail.go:282] Tailing /dev/fd/3
mtail[367499]: I0905 15:15:34.297620  367499 mtail.go:126] Listening on [::]:3903

Specifically, the warnings for capture groups 8, 10, and 11.

Thank you!

jaqx0r commented 1 year ago

Internally the regexp engine assigns every capture group a number, regardless of whether it also has a name. mtail should do a better job of noticing the dual assignment and not complain about the numeric capture group references.