elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

(Tasks API) report user agent under headers #112845

Closed stefnestor closed 3 weeks ago

stefnestor commented 1 month ago

Description

👋 howdy, team!

Our Elastic Cloud logs record user_agent, which has been a helpful heuristic for determining which upstream traffic induced a request. According to Google, the user agent is an HTTP header, which makes it sound eligible to be passed into the headers emitted by the List Tasks API.

This would help close the gap especially for on-prem services to determine traffic instigators, but it would also be helpful in Cloud to enable customer self-service on tracing traffic to its upstream requestor to reduce related Support investigation times.

Building off today's similar request for Logstash (logstash#16448), which covers comparable data, this would appear as:

$ cat tasks.json | jq '.nodes[].tasks[]|select(.action=="indices:data/write/bulk")|select(.headers."X-elastic-product-origin"=="kibana")'
{
  "node": "XXXXX",
  "id": 194746221,
  "type": "transport",
  "action": "indices:data/write/bulk",
  "description": "requests[5], indices[.internal.alerts-observability.metrics.alerts-default-000001]",
  "start_time": "2024-08-20T21:02:38.357Z",
  "start_time_in_millis": 1724187758357,
  "running_time": "2.7s",
  "running_time_in_nanos": 2779492550,
  "cancellable": false,
  "headers": {
    "X-elastic-product-origin": "kibana",
    "trace.id": "258cc02b3387a34fade912fbdXXXXXXX",
    "X-Opaque-Id": "unknownId;kibana:task%20manager:run%20alerting%3Ametrics.alert.threshold:53b10185-XXXX-4494-b859-964d1b7XXXXX",
+    "user-agent": "Kibana v8.15.0" 
  }
}
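For anyone without jq at hand, the same selection can be sketched in Python. This is a minimal sketch that assumes the saved response has the usual List Tasks shape (`nodes` keyed by node ID, each with `tasks` keyed by task ID); the function name `select_tasks` and the sample node and task IDs are hypothetical, made up for illustration.

```python
import json


def select_tasks(task_list, action, origin):
    """Yield tasks whose action and X-elastic-product-origin header match."""
    for node in task_list.get("nodes", {}).values():
        for task in node.get("tasks", {}).values():
            headers = task.get("headers", {})
            if (task.get("action") == action
                    and headers.get("X-elastic-product-origin") == origin):
                yield task


# Trimmed, hypothetical sample shaped like a List Tasks response.
sample = json.loads("""
{
  "nodes": {
    "node-a": {
      "tasks": {
        "node-a:1": {
          "node": "node-a",
          "id": 1,
          "action": "indices:data/write/bulk",
          "headers": {
            "X-elastic-product-origin": "kibana",
            "X-Opaque-Id": "example-opaque-id"
          }
        },
        "node-a:2": {
          "node": "node-a",
          "id": 2,
          "action": "indices:data/read/search",
          "headers": {}
        }
      }
    }
  }
}
""")

matches = list(select_tasks(sample, "indices:data/write/bulk", "kibana"))
for task in matches:
    # Print the task ID with whatever caller detail the headers carry.
    print(task["id"], task["headers"].get("X-Opaque-Id"))
```

To run it against a real capture, replace `sample` with `json.load(open("tasks.json"))`.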

TIA! 🙏

elasticsearchmachine commented 1 month ago

Pinging @elastic/es-distributed (Team:Distributed)

DaveCTurner commented 3 weeks ago

We (the @elastic/es-distributed-coordination team) discussed this today. It is of course technically feasible to copy whatever headers one might want into the task context but each one carries some overhead (both computational and conceptual) so we'd want to draw the line somewhere. Given this thinking, our general feeling was that this niche is already filled with the X-Opaque-Id (and X-elastic-product-origin) fields, as well as the integration with APM. Together, these existing fields already carry details about the requesting application in most cases, typically way more detail than just the user-agent header. In cases where they aren't sufficient, we'd rather treat that as a deficiency in the calling application instead of adding workarounds in Elasticsearch itself. Thus we decided to close this without taking action.
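As a rough illustration of the suggested alternative, a calling application can identify itself by setting X-Opaque-Id on its own requests, so the value shows up under `headers` in the task list. The sketch below only builds the request and does not send it; the helper name `build_search_request`, the URL, the index name, and the ID format are all assumptions chosen for the example, not anything prescribed by Elasticsearch.

```python
import urllib.request


def build_search_request(base_url, index, opaque_id, body=b"{}"):
    """Build (but do not send) a search request tagged with X-Opaque-Id."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Hypothetical ID format: "<application>:<job>".
            "X-Opaque-Id": opaque_id,
        },
    )


req = build_search_request(
    "http://localhost:9200",            # assumed local cluster address
    "my-index",                         # hypothetical index name
    "billing-service:nightly-report",   # hypothetical caller identity
)

# urllib normalizes header capitalization, so compare keys in lowercase.
sent = {name.lower(): value for name, value in req.header_items()}
print(sent["x-opaque-id"])
```

Putting both the application name and the specific job into the ID is one way to get more detail than a bare user-agent string would carry.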