logstash-plugins / logstash-patterns-core

Feat: make SQUID3 captures ecs compliant #270

Closed (kares closed 4 years ago)

kares commented 4 years ago

"1525334330.556 3 120.65.1.1 TCP_REFRESH_MISS/200 2014 GET http://www.sample.com/hellow_world.txt public-user DIRECT/www.sample.com text/plain 902351708.872"

Matching before and after:

Before (legacy field names):

      "timestamp"=>"1525334330.556",
      "duration"=>"3",
      "client_address"=>"120.65.1.1",
      "cache_result"=>"TCP_REFRESH_MISS",
      "status_code"=>"200",
      "bytes"=>"2014",
      "request_method"=>"GET",
      "url"=>"http://www.sample.com/hellow_world.txt",
      "user"=>"public-user",
      "hierarchy_code"=>"DIRECT",
      "server"=>"www.sample.com",
      "content_type"=>"text/plain",

After (ECS field names):

      "timestamp"=>"1525334330.556",
      "source"=>{"ip"=>"120.65.1.1"},
      "http"=>{
          "request"=>{"method"=>"GET"},
          "response"=>{"status_code"=>200, "body"=>{"bytes"=>2014}}
      },
      "destination"=>{"address"=>"www.sample.com"},
      "url"=>{"original"=>"http://www.sample.com/hellow_world.txt"},
      "user"=>{"name"=>"public-user"},
      "squid"=>{
          "request"=>{"duration"=>3, "content_type"=>"text/plain"},
          "hierarchy_code"=>"DIRECT",
          "cache_result_code"=>"TCP_REFRESH_MISS"
      }
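
To make the shape change concrete, here is a minimal Ruby sketch that turns the sample line into the nested, ECS-style hash shown in this description. The regex `SQUID_LINE` and the helper `squid_to_ecs` are hypothetical names written only for this one sample; they are not the SQUID3 grok pattern from this repository, and real Squid logs may need a more tolerant expression.

```ruby
# Hypothetical regex for the sample line only; NOT the actual SQUID3 grok pattern.
SQUID_LINE = %r{
  \A(?<timestamp>\d+\.\d+)\s+
  (?<duration>\d+)\s+
  (?<client>\S+)\s+
  (?<cache_result>\w+)/(?<status>\d+)\s+
  (?<bytes>\d+)\s+
  (?<method>\S+)\s+
  (?<url>\S+)\s+
  (?<user>\S+)\s+
  (?<hierarchy>\w+)/(?<server>\S+)\s+
  (?<content_type>\S+)
}x

# Builds the nested hash from the description; numeric fields become Integers.
# Returns nil when the line does not match.
def squid_to_ecs(line)
  m = SQUID_LINE.match(line)
  return nil unless m

  {
    "timestamp"   => m[:timestamp],
    "source"      => { "ip" => m[:client] },
    "http"        => {
      "request"  => { "method" => m[:method] },
      "response" => { "status_code" => m[:status].to_i,
                      "body" => { "bytes" => m[:bytes].to_i } }
    },
    "destination" => { "address" => m[:server] },
    "url"         => { "original" => m[:url] },
    "user"        => { "name" => m[:user] },
    "squid"       => {
      "request"           => { "duration" => m[:duration].to_i,
                               "content_type" => m[:content_type] },
      "hierarchy_code"    => m[:hierarchy],
      "cache_result_code" => m[:cache_result]
    }
  }
end

line = "1525334330.556 3 120.65.1.1 TCP_REFRESH_MISS/200 2014 GET " \
       "http://www.sample.com/hellow_world.txt public-user " \
       "DIRECT/www.sample.com text/plain 902351708.872"
event = squid_to_ecs(line)
puts event.dig("http", "response", "status_code") # prints 200
```

Note the two visible differences from the legacy output: fields are nested under ECS namespaces, and numeric captures are integers rather than strings.
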

kares commented 4 years ago

This is going to get some ECS specs (for now the specs part is only for the legacy behavior, since tests were missing). Could you please do an ECS review for this, looking at the PR description with a sample log? // cc @webmat @ebeahan

ebeahan commented 4 years ago

I've left some suggestions. A squid module was recently added in elastic/beats#19713, and I did some comparison against the expected test data there in addition to peeking at the Squid LogFormat docs.

kares commented 4 years ago

Thanks Eric, I've addressed your concerns:

The cache result code could be mapped to event.action, e.g. event.action: TCP_REFRESH_MISS

I'd opt for http.response.bytes vs http.response.body.bytes since both the HTTP response body and headers are counted in the size.

The content_type field refers to the HTTP reply header, so I'd opt for using squid.response.content_type.
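
As a rough illustration of the three suggestions quoted above, the Ruby sketch below rewrites the relevant part of the event from the PR description. The helper name `apply_review_suggestions` is hypothetical, and the sketch assumes each field is moved rather than duplicated, which the thread does not state explicitly; the field names in the merged pattern may differ.

```ruby
# Hypothetical helper: applies the three review suggestions to an event hash.
# Assumption: fields are moved to their new location, not copied.
def apply_review_suggestions(event)
  out = Marshal.load(Marshal.dump(event)) # deep copy, the input stays untouched
  squid = out.fetch("squid")
  response = out.fetch("http").fetch("response")

  # 1. squid.cache_result_code -> event.action
  out["event"] = { "action" => squid.delete("cache_result_code") }

  # 2. http.response.body.bytes -> http.response.bytes
  #    (the logged size counts both headers and body)
  response["bytes"] = response.delete("body").fetch("bytes")

  # 3. squid.request.content_type -> squid.response.content_type
  #    (the value comes from the HTTP reply header)
  squid["response"] = { "content_type" => squid.fetch("request").delete("content_type") }

  out
end

# Only the fields touched by the review, copied from the PR description.
before = {
  "http"  => { "request"  => { "method" => "GET" },
               "response" => { "status_code" => 200, "body" => { "bytes" => 2014 } } },
  "squid" => {
    "request"           => { "duration" => 3, "content_type" => "text/plain" },
    "hierarchy_code"    => "DIRECT",
    "cache_result_code" => "TCP_REFRESH_MISS"
  }
}

after = apply_review_suggestions(before)
puts after.dig("event", "action")
puts after.dig("http", "response", "bytes")
puts after.dig("squid", "response", "content_type")
```
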

I've been following the Beats repo from a tag, so I missed the squid support (on master); thanks for the links. I did not look in detail, but compared to the others I've been following, Beats' squid support is a bit different. I probably lack a lot of context on that PR, but (besides event.action) they match http.request.method under event.code.

webmat commented 4 years ago

Yeah I like this mapping.

Reviewing this made me realize we added file.mime_type a while ago, but we hadn't added http.[request|response].mime_type yet. I'll see if we can get those in for ECS 1.6. Watch out for it in the next ECS release.

But for now this is good 👍

kares commented 4 years ago

Thanks Mat, I've added a check-list item to consider mime_type before we polish out the ECS efforts here.