elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

haproxy grok pattern is incorrect #23134

Closed trtrmitya closed 6 years ago

trtrmitya commented 7 years ago

In your grok pattern for haproxy, you use: HAPROXYHTTPBASE %{IP:client_ip}:%{INT:client_port}

However, haproxy can receive requests not only over TCP sockets but also over UNIX sockets.

In that case the log contains something like "unix:3" instead of IP:PORT, so this grok pattern fails to parse %{IP:client_ip} and raises an exception.
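To make the failure concrete, here is a minimal Python sketch. The regex is only a simplified stand-in for %{IP:client_ip}:%{INT:client_port} (IPv4 only, not the real grok definitions), the name IP_PORT is hypothetical, and the TCP address is a made-up sample.

```python
import re

# Simplified stand-in for %{IP:client_ip}:%{INT:client_port}.
# Assumption: IPv4 only; the real grok IP pattern is broader.
IP_PORT = re.compile(r"(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<client_port>\d+)")

def parse_client(field):
    """Return the captured groups, or None when the field does not match."""
    match = IP_PORT.fullmatch(field)
    return match.groupdict() if match else None

tcp_result = parse_client("10.0.0.1:51234")  # made-up TCP client: matches
unix_result = parse_client("unix:3")         # UNIX socket client: no match
```

The UNIX socket form has no dotted address at all, so an IP-only pattern cannot match it.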

talevy commented 7 years ago

Hi @trtrmitya, thanks for the info, I wasn't aware of this.

Do you have an example log? It would seem more appropriate to modify the HAPROXYTCP pattern to accept unix:%{INT:unix_socket_id} in place of the IP/port.

haproxy patterns ref: https://github.com/elastic/elasticsearch/blob/5.2/modules/ingest-common/src/main/resources/patterns/haproxy#L39
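One way to picture the suggested change is an alternation that accepts either form. The sketch below is plain Python regex, not grok syntax: the group name unix_socket_id follows the suggestion above, the name CLIENT is hypothetical, the IPv4-only address part is a simplification, and the TCP address is a made-up sample.

```python
import re

# Accept either "IP:PORT" or "unix:N" at the client position.
# Assumption: IPv4-only stand-in for grok's IP pattern.
CLIENT = re.compile(
    r"(?:(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<client_port>\d+)"
    r"|unix:(?P<unix_socket_id>\d+))"
)

def parse_client(field):
    """Return only the groups that took part in the match, or None."""
    match = CLIENT.fullmatch(field)
    if match is None:
        return None
    return {name: value for name, value in match.groupdict().items() if value is not None}

unix_fields = parse_client("unix:1")
tcp_fields = parse_client("192.0.2.10:40000")
```

Keeping the two forms in separate named groups means a document ends up with either client_ip and client_port or unix_socket_id, never a socket id stored under a port field.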

trtrmitya commented 7 years ago

Sure, here is an example line:

Feb 14 00:00:00 smetanka5 haproxy[88855]: unix:1 [14/Feb/2017:00:00:00.049] MTDICT-front MTDICT-back/urozhaj5a 0/0/0/0/0 200 125 - - ---- 0/0/0/0/0 0/0 "POST /dicservice.json/lookup?srv=ios&lang=en-ru&ui=ru&flags=3 HTTP/1.1"

talevy commented 7 years ago

thanks @trtrmitya!

Would you mind testing this grok processor with your data, using the simulate request below, and letting me know if it works?

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors" : [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{MYUNIXHAPROXYHTTP}"],
          "pattern_definitions": {
            "MYUNIXHAPROXYHTTPBASE": "unix:%{INT:client_port} \\[%{HAPROXYDATE:accept_date}\\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{INT:time_request}/%{INT:time_queue}/%{INT:time_backend_connect}/%{INT:time_backend_response}/%{NOTSPACE:time_duration} %{INT:http_status_code} %{NOTSPACE:bytes_read} %{DATA:captured_request_cookie} %{DATA:captured_response_cookie} %{NOTSPACE:termination_state} %{INT:actconn}/%{INT:feconn}/%{INT:beconn}/%{INT:srvconn}/%{NOTSPACE:retries} %{INT:srv_queue}/%{INT:backend_queue} (\\{%{HAPROXYCAPTUREDREQUESTHEADERS}\\})?( )?(\\{%{HAPROXYCAPTUREDRESPONSEHEADERS}\\})?( )?\"(<BADREQ>|(%{WORD:http_verb} (%{URIPROTO:http_proto}://)?(?:%{USER:http_user}(?::[^@]*)?@)?(?:%{URIHOST:http_host})?(?:%{URIPATHPARAM:http_request})?( HTTP/%{NUMBER:http_version})?))?",
            "MYUNIXHAPROXYHTTP": "(?:%{SYSLOGTIMESTAMP:syslog_timestamp}|%{TIMESTAMP_ISO8601:timestamp8601}) %{IPORHOST:syslog_server} %{SYSLOGPROG}: %{MYUNIXHAPROXYHTTPBASE}"
          }
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "Feb 14 00:00:00 smetanka5 haproxy[88855]: unix:1 [14/Feb/2017:00:00:00.049] MTDICT-front MTDICT-back/urozhaj5a 0/0/0/0/0 200 125 - - ---- 0/0/0/0/0 0/0 \"POST /dicservice.json/lookup?srv=ios&lang=en-ru&ui=ru&flags=3 HTTP/1.1\""
      }
    }
  ]
}

Seems to do the trick for the log line you shared. If this works out for you, I will add support for this to our patterns so that the extra pattern_definitions are no longer needed.

thanks!

trtrmitya commented 7 years ago

Yes, this looks correct. Thank you.

elasticmachine commented 6 years ago

Pinging @elastic/es-core-infra

talevy commented 6 years ago

Closing, since the custom pattern_definitions work for this use case. Keeping the haproxy patterns up to date deserves a separate investigation and issue.