elastic / integrations

Elastic Integrations
https://www.elastic.co/integrations

[traefik] Traefik v2 Access Logs format #9117

Open fmiqbal opened 7 months ago

fmiqbal commented 7 months ago

I've checked, and the integration only supports Traefik v1.6, while the current Traefik version is v2.10.

This makes the ingest pipeline fail with a backend_url error. Only a few fields are different:

v1            v2           Description
FrontendName  RouterName   The name of the Traefik router
BackendName   ServiceName  The name of the Traefik backend
BackendURL    ServiceURL   The URL of the Traefik backend
BackendAddr   ServiceAddr  The IP:port of the Traefik backend

There are also new fields, notably TLSVersion and TLSCipher.
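
To make the renames concrete, here is a minimal Python sketch of the mapping in the table above. The helper names `to_v2_names` and `detect_version` are hypothetical and only illustrate the idea; they are not part of the integration.

```python
# Field renames between Traefik v1 and v2 access logs, as listed in the table above.
V1_TO_V2 = {
    "FrontendName": "RouterName",
    "BackendName": "ServiceName",
    "BackendURL": "ServiceURL",
    "BackendAddr": "ServiceAddr",
}

# Fields this issue names as new in v2.
NEW_IN_V2 = ("TLSVersion", "TLSCipher")


def to_v2_names(entry: dict) -> dict:
    """Return a copy of entry with v1 field names replaced by their v2 equivalents."""
    return {V1_TO_V2.get(key, key): value for key, value in entry.items()}


def detect_version(entry: dict) -> str:
    """Guess the naming scheme of an access-log entry (hypothetical helper)."""
    v2_markers = tuple(V1_TO_V2.values()) + NEW_IN_V2
    if any(name in entry for name in v2_markers):
        return "v2"
    if any(name in entry for name in V1_TO_V2):
        return "v1"
    return "unknown"
```

An entry that carries none of the renamed or new fields cannot be told apart by name alone, which is why the sketch returns "unknown" in that case.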

Here is an example of a Traefik v2 access log entry in JSON format:


{
    "ClientAddr": "10.10.8.105:48376",
    "ClientHost": "103.250.14.10",
    "ClientPort": "48376",
    "ClientUsername": "-",
    "DownstreamContentSize": 88,
    "DownstreamStatus": 200,
    "Duration": 59518533,
    "OriginContentSize": 88,
    "OriginDuration": 59428568,
    "OriginStatus": 200,
    "Overhead": 89965,
    "RequestAddr": "api-students.unpad.ac.id",
    "RequestContentSize": 0,
    "RequestCount": 75,
    "RequestHost": "api-students.unpad.ac.id",
    "RequestMethod": "GET",
    "RequestPath": "/api/v1/study/140410210038/card/comment",
    "RequestPort": "-",
    "RequestProtocol": "HTTP/1.0",
    "RequestScheme": "http",
    "RetryAttempts": 0,
    "RouterName": "app-unpad-students-api-prod-app-unpad-students-api-api-students-unpad-ac-id-api@kubernetes",
    "ServiceAddr": "10.1.25.243:80",
    "ServiceName": "app-unpad-students-api-prod-app-unpad-students-api-80@kubernetes",
    "ServiceURL": {
        "Scheme": "http",
        "Opaque": "",
        "User": null,
        "Host": "10.1.25.243:80",
        "Path": "",
        "RawPath": "",
        "OmitHost": false,
        "ForceQuery": false,
        "RawQuery": "",
        "Fragment": "",
        "RawFragment": ""
    },
    "StartLocal": "2024-02-09T11:53:32.609696286Z",
    "StartUTC": "2024-02-09T11:53:32.609696286Z",
    "entryPointName": "web",
    "level": "info",
    "msg": "",
    "time": "2024-02-09T11:53:32Z"
}

And this is an entry in Common Log Format (CLF):

180.252.162.107 - - [09/Feb/2024:11:49:44 +0000] "GET /api/v1/title_submission/180510210014?offset=0 HTTP/1.0" 200 197 "-" "-" 272 "app-unpad-students-api-prod-app-unpad-students-api-api-students-unpad-ac-id-api@kubernetes" "http://10.1.76.249:80" 181ms
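
For reference, the line above can be split into fields with a regular expression. This is only a sketch based on this one sample line: the group names are my own choice, and other entries (for example ones without a router or service) may need a looser pattern.

```python
import re
from typing import Optional

# Pattern derived from the single sample line above; group names are illustrative.
CLF_PATTERN = re.compile(
    r'(?P<client_host>\S+) - (?P<client_username>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d+) (?P<content_size>\d+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)" '
    r'(?P<request_count>\d+) '
    r'"(?P<router_name>[^"]*)" "(?P<service_url>[^"]*)" '
    r'(?P<duration_ms>\d+)ms'
)

SAMPLE = (
    '180.252.162.107 - - [09/Feb/2024:11:49:44 +0000] '
    '"GET /api/v1/title_submission/180510210014?offset=0 HTTP/1.0" 200 197 "-" "-" 272 '
    '"app-unpad-students-api-prod-app-unpad-students-api-api-students-unpad-ac-id-api@kubernetes" '
    '"http://10.1.76.249:80" 181ms'
)


def parse_clf(line: str) -> Optional[dict]:
    """Parse one CLF access-log line; return None if it does not match."""
    match = CLF_PATTERN.fullmatch(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely numeric fields to integers.
    for key in ("status", "content_size", "request_count", "duration_ms"):
        fields[key] = int(fields[key])
    return fields
```

Calling `parse_clf(SAMPLE)` yields the router name in `router_name` and the service URL in `service_url`, the two values that were called frontend and backend in v1.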

Also, BackendAddr (ServiceAddr) may be missing when a request hits Traefik's internal metrics / health endpoints (I suppose), because such entries have no ServiceAddr field. It looks like this:

{
    "ClientAddr": "10.10.8.54:13745",
    "ClientHost": "10.10.8.54",
    "ClientPort": "13745",
    "ClientUsername": "-",
    "DownstreamContentSize": 3281,
    "DownstreamStatus": 200,
    "Duration": 1755009,
    "GzipRatio": 0,
    "OriginContentSize": 0,
    "OriginDuration": 0,
    "OriginStatus": 0,
    "Overhead": 1755009,
    "RequestAddr": "10.10.8.51:9101",
    "RequestContentSize": 0,
    "RequestCount": 79,
    "RequestHost": "10.10.8.51",
    "RequestMethod": "GET",
    "RequestPath": "/metrics",
    "RequestPort": "9101",
    "RequestProtocol": "HTTP/1.1",
    "RequestScheme": "http",
    "RetryAttempts": 0,
    "RouterName": "prometheus@internal",
    "StartLocal": "2024-02-09T11:53:43.506206862Z",
    "StartUTC": "2024-02-09T11:53:43.506206862Z",
    "entryPointName": "metrics",
    "level": "info",
    "msg": "",
    "time": "2024-02-09T11:53:43Z"
}

For now I have only changed the ingest pipeline:

  1. Open logs-traefik.access-1.11.1-format-json
  2. Add a drop processor after the JSON is parsed into temp
      {
        "json": {
          "field": "event.original",
          "target_field": "temp"
        }
      },
    + {
    +   "drop": {
    +     "if": "ctx.temp.ServiceAddr == null"
    +   }
    + },
      ...
  3. Change FrontendName to RouterName and BackendAddr to ServiceAddr
  {
    "rename": {
-     "field": "temp.FrontendName",
+     "field": "temp.RouterName",
      "target_field": "traefik.access.frontend_name",
      "ignore_missing": true
    }
  },
  {
    "rename": {
-     "field": "temp.BackendName",
+     "field": "temp.ServiceAddr",
      "target_field": "traefik.access.backend_url",
      "ignore_missing": true
    }
  },
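
The effect of the two changes above can be sketched in Python. This is not the ingest pipeline itself, only a rough emulation of the drop and rename steps under the assumption that they behave as written in the diff; `apply_workaround` is a hypothetical name.

```python
import json
from typing import Optional


def apply_workaround(event_original: str) -> Optional[dict]:
    """Emulate the modified pipeline: parse, drop if ServiceAddr is missing, rename."""
    # Equivalent of the json processor: parse event.original into temp.
    temp = json.loads(event_original)

    # Equivalent of the drop processor: discard events without ServiceAddr.
    if temp.get("ServiceAddr") is None:
        return None

    # Equivalent of the rename processors with ignore_missing: true.
    access = {}
    if "RouterName" in temp:
        access["frontend_name"] = temp.pop("RouterName")
    if "ServiceAddr" in temp:
        access["backend_url"] = temp.pop("ServiceAddr")

    return {"temp": temp, "traefik": {"access": access}}
```

With this logic, an internal entry such as the `prometheus@internal` sample returns `None`, so it never reaches the index.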


ltflb-bgdi commented 7 months ago

@fmiqbal I started refactoring this integration before my holidays already and will continue (and hopefully finish) today. FYI, take a look at this related issue as well.

Here are some questions I asked myself during my current work:

  1. Field renamings: Since the supported version reached EOL in 2018, should we stick with the Frontend and Backend field names or change them to Service and Router instead?
  2. I opt for following ECS best practices and nesting the service and router fields, e.g. service.name, service.url etc. instead of service_url etc.
  3. Most of the service URL parameters follow the ECS url standard, with some additional fields like opaque and some raw... fields. Should these extra fields be added as well or just dropped?
  4. I would not drop the "internal" events. In one of our use cases, for example, we use Traefik as a redirect server, and thus we would like to get these events as well. The pipeline should simply not break if service fields are missing.
  5. My proposal above contains breaking changes. IMO this is not a big issue: given the supported versions and the lack of other reported issues so far, the integration does not seem to be heavily used.
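
As a rough illustration of points 2 to 4, the following Python sketch nests the router and service fields, keeps only the URL parts that have an ECS url counterpart, and tolerates entries without any service fields. The nested key names and the function name `nest_access_fields` are assumptions for illustration, not the final field names of the integration.

```python
from typing import Any, Dict


def nest_access_fields(temp: Dict[str, Any]) -> Dict[str, Any]:
    """Build nested router/service fields from a parsed v2 JSON access-log entry."""
    access: Dict[str, Any] = {}

    if temp.get("RouterName"):
        access["router"] = {"name": temp["RouterName"]}

    service: Dict[str, Any] = {}
    if temp.get("ServiceName"):
        service["name"] = temp["ServiceName"]
    if temp.get("ServiceAddr"):
        service["address"] = temp["ServiceAddr"]

    raw_url = temp.get("ServiceURL")
    if isinstance(raw_url, dict):
        url: Dict[str, Any] = {}
        if raw_url.get("Scheme"):
            url["scheme"] = raw_url["Scheme"]
        # Split "host:port" into ECS-style domain and port.
        host = raw_url.get("Host") or ""
        domain, sep, port = host.rpartition(":")
        if sep and port.isdigit():
            url["domain"] = domain
            url["port"] = int(port)
        elif host:
            url["domain"] = host
        if raw_url.get("Path"):
            url["path"] = raw_url["Path"]
        if raw_url.get("RawQuery"):
            url["query"] = raw_url["RawQuery"]
        if raw_url.get("Fragment"):
            url["fragment"] = raw_url["Fragment"]
        # Opaque, RawPath, RawFragment, OmitHost and ForceQuery are dropped here.
        if url:
            service["url"] = url

    # Internal entries without service fields simply get no "service" object.
    if service:
        access["service"] = service
    return access
```

For an internal entry that only carries RouterName, the result contains just the router object, so nothing breaks when the service fields are absent.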

Please let me know your opinion

Regards Bernhard

fmiqbal commented 7 months ago

  1. I'd prefer using Service and Router instead. The future (v3) version also looks like it still uses router and service: https://doc.traefik.io/traefik/master/observability/access-logs
  2. I can't really comment on this, but if you mean making it nested instead of flat, I prefer nested; it's just nicer to look at. Service and request could probably be done like this, but would router, which only has one field (RouterName), also be nested or not?
  3. Since those fields don't really have a description in the Traefik docs, I lean towards just dropping the extras. I currently don't think those fields are of much use.
  4. I agree. I think Traefik should just have an option to include internal resources in its access logs or not, and the v3 version looks like it does: https://doc.traefik.io/traefik/master/migration/v2-to-v3/#internal-resources-observability-accesslogs-metrics-and-tracing
  5. Yes, I agree.

ltflb-bgdi commented 7 months ago

Related issue #8886