elastic / apm

Elastic Application Performance Monitoring - resources and general issue tracking for Elastic APM.
https://www.elastic.co/apm
Apache License 2.0
374 stars 111 forks source link

If client closes socket prematurely, mark transaction in a special way #154

Open watson opened 4 years ago

watson commented 4 years ago

Description of the issue

It came up in elastic/apm-agent-nodejs#1411 that the Node.js agent doesn't record the fact that a client closes the socket to the app being monitored. Most (if not all?) agents just happily record the transaction as successful - which, from the point of view of the service it also kind of is.

It would, therefore, be a nice feature if, when we're tracing an incoming HTTP request and the client closes the socket prematurely, that the agent can detect this and mark the transaction. Today the transaction just looks like it was successful even though the client never received the response.

Possible ways to mark the transaction:

  1. Change the transaction result from the regular result (e.g. HTTP 2xx) to for example aborted
  2. Add a "mark" to the transaction when the socket was closed similar to how the RUM agent marks "time to first byte" etc.
  3. Invent a new field, e.g. socket.status: "aborted"

I don't think I'm a big fan of option 3 as not all transactions have a socket associated with them. I really like option 2 as it allows us to record exactly when it happened. Option 1 is nice as well as it allows us to easily filter by this and display it in the UI - though maybe that's not actually a use case?

What we are voting on

@elastic/apm-agent-devs Should the agents who can implement something like this?

Please also share your thoughts in the comments on which way you think this could best be implemented or if you think it shouldn't be something that is recorded.

Vote

Agent Yes No Indifferent N/A Link to agent issue
.NET
  • [ ]
  • [ ]
  • [ ]
  • [ ]
Go
  • [ ]
  • [ ]
  • [ ]
  • [ ]
Java
  • [ ]
  • [ ]
  • [ ]
  • [ ]
Node.js
  • [ ]
  • [ ]
  • [ ]
  • [ ]
Python
  • [ ]
  • [ ]
  • [ ]
  • [ ]
Ruby
  • [ ]
  • [ ]
  • [ ]
  • [ ]
RUM
  • [ ]
  • [ ]
  • [ ]
  • [x]
graphaelli commented 4 years ago

Does response.finished cover this case or is that insufficient? afaik only the node agent uses it but that could be stale information.

axw commented 4 years ago

I like the idea of adding a mark (option 2) as a minimal improvement, as it'll give people a simple visual means of understanding odd behaviour.

Option 1 sounds like it could be useful for identifying trends in unusual client behaviour, e.g. increasing rate of aborted transactions.

watson commented 4 years ago

@graphaelli

Does response.finished cover this case or is that insufficient? afaik only the node agent uses it but that could be stale information.

Good question. Currently response.finished and response.headers_sent is only used for errors. If the node agent captures an error and the agent records information about the associated HTTP request (if one exists). If the error happened before the HTTP headers were sent to the client response.headers_sent will be set to false, if it happened before the request was finished response.finished is set to false. So the node agent currently never set these properties on transactions.

I'm personally still most in favor of option 2, but it could maybe be combined with using this field if it gave the user extra info.

Qard commented 4 years ago

I'd go with a hybrid of 1 and 2. The searchability of 1 is nice, and the timeline display of 2 is nice too.