elastic / apm-agent-dotnet

https://www.elastic.co/guide/en/apm/agent/dotnet/current/index.html
Apache License 2.0
581 stars 207 forks source link

Expose fields and objects from `Transaction.Context` through the public agent API #124

Closed gregkalapos closed 5 years ago

gregkalapos commented 5 years ago

Why?

The current public agent API only exposes a small part of the model that the APM Server expects. These are mainly things directly on Transaction and Span. But Transaction.Context and Span.Context are not exposed. On those contexts we can set things like HTTP request related fields (HTTP method, URL, etc.) or Database related fields.

Since there are lots of libraries that we currently don't support with auto instrumentation we should expose these fields and let users set those when they rely on the public agent API.

This PR already contains a use case: https://github.com/elastic/apm-agent-dotnet/pull/118

Problem description

To avoid a too high-level discussion, I'll focus specifically on Transaction.Context.Request, but we should came up with a solution that works with other things on Transaction.Context and Span.Context.

Request has 2 required fields:

This means that both of these fields must be set, otherwise the the server would reject the Transaction.

Requirements

(not set it stone, feel free to disagree)

Potential solutions:

1. Lazy initialization (original attempt)

public interface ITransaction
{
   IRequest Request { get; }
}

internal class Transaction : ITransaction
{
  public IRequest Request => _context.Value?.Request;
}

internal class Context
{
  private readonly Lazy<Request> _request = new Lazy<Request>();
  public Response Response => _response.Value;
}

Advantage:

Disadvantage:

2. Exposing 2 methods on ITransaction:

public interface ITransaction
{
   void SetRequest(string method, string urlProtocol, string urlFull, string urlHostName, 
      string urlRaw = null, bool socketEncrypted = false, string socketRemoteAddress = null,
      string httpVersion = null, object body = null);

   (string method, string urlProtocol, string urlFull, string urlHostName, string urlRaw, 
      bool socketEncrypted, string httpVersion, object body) GetRequest();
}

internal class Request
{
 // with a ctor we force to pass required fields. 
  public Request(Url url, string method)
  {
    Url = url;
    Method = method;
  } 
  public string HttpVersion {//get&set}
  public object Body {//get&set}
  public string Method {//get&set}
  public Socket Socket { get; set; }
  public Url Url { get; set; }
}

internal class Transaction : ITransaction
{
  public void SetRequest(string method, string urlProtocol, string urlFull, string urlHostName, 
      string urlRaw = null, bool socketEncrypted = false, string socketRemoteAddress = null, 
      string httpVersion = null, object body = null)
     => _context.Value.Request = new Request(new Url() { Full =  urlFull, Protocol =  urlProtocol, 
         Raw =  urlRaw, HostName =  urlHostName}, method){Body =  body, Socket
          =  new Socket() { Encrypted =   socketEncrypted, RemoteAddress =  socketRemoteAddress},
         HttpVersion =  httpVersion};

}

public (string method, string urlProtocol, string urlFull, string urlHostName, string urlRaw, bool socketEncrypted,
  string httpVersion, object body)
    GetRequest() => _context.Value.Request == null ?
      (string.Empty, string.Empty, string.Empty, string.Empty, string.Empty, false, string.Empty, null) :
        (_context.Value.Request.Method, _context.Value.Request.Url.Protocol, _context.Value.Request.Url.Full,
          _context.Value.Request.Url.HostName, _context.Value.Request.Url.Raw, _context.Value.Request.Socket.Encrypted,
          _context.Value.Request.HttpVersion, _context.Value.Request.Body);

Here is how the user code'd look like:

transaction.SetRequest(method: "GET", urlProtocol: "HTTP", urlFull:"https://myUrl.com", urlHostName: "MyUrl", httpVersion: "2.0");

 //later in the code:
var request = transaction.GetRequest();
Console.WriteLine(request.method);
Console.WriteLine(request.httpVersion); //etc...

Advantages:

Disadvantage:

One modification of this approach would be to use specific types as parameters.

3. Introducing intermediate types that work as public API

Implemented in elastic/apm-agent-dotnet#130

4. exposing the intake API as it is.

Implemented in https://github.com/elastic/apm-agent-dotnet/pull/134

@elastic/dotnet: maybe someone has an idea, opinion or just a bright comment.

Mpdreamz commented 5 years ago

Yes NullReferenceException is bad but we should not be too clever here IMO.

If you program directly with the public API nulls (just like anything else with C#) should be part of the game. I particularly dislike returning empty strings for properties that were not set.


public class Transaction
{
    internal Context Context { get; } = new Context();
    public Request Request 
    {
        get=> Context.Request;
        set => Context.Request = value;
    }
}

public class Request 
{
  public Request(Url url, string method) => (Url, Method) = (url, method);

  public string Method { get; }
  public Url Url { get; }

  public string HttpVersion { get; set; }
  public object Body { get; set; }
  public Socket Socket { get; set; }
}

The statement that the following should never throw:

Agent.Tracer.CurrentTransaction.Request.Method = "GET";

Almost forces us to a design pattern that would be a poor implementation of functional lenses in C#. I don't think making this a requirement does our users any good in the long run. The nodejs agent also documents the fact some properties can be null if we need prior art to validate this design choice.

gregkalapos commented 5 years ago

The statement that the following should never throw:

Agent.Tracer.CurrentTransaction.Request.Method = "GET";

Almost forces us to a design pattern that would be a poor implementation of functional lenses in C#.

I agree on that. If we have a property chain like that then it's impossible to avoid nullrefs, or it'd be very strange. My intention with saying this was rather this: maybe we should find a way to avoid such properties. And the 2. proposed solution avoids that property chain.

I think we can consider your code snippet as a 3. solution.

Here is what I think: Advantage:

Disadvantage:

A variation of this would be to not expose model classes that are expected by the server, but using API classes instead. I'd consider this option.

Also to:

I particularly dislike returning empty strings for properties that were not set.

We can also return nulls instead of empty string... the point of the 2. proposed solution is to hide the model type and force the users to pass every required field on the API level, but not to avoid null in the "end of the property chain". I'm totally fine with returning (null, null, null, ... , null).

gregkalapos commented 5 years ago

Discussed this with @SergeyKleyman and @Mpdreamz.

Decision: We expose the model that we have to the APM server with the Intake API as it is and we won't introduce any intermediate layer on the API. (proposed solution 4.)

Mainly because adding an intermediate layer (like in #130) would mean that we have to maintain a mapping, which is not trivial.

Regarding breaking changes: we assume that on the intake API we will only have breaking changes on major versions and in those cases we will also have breaking changes in the C# API which is acceptable.

The other advantage is that everything which is in the server docs and in the intake API doc automatically applies to the .NET API.

cc: @elastic/apm-agent-devs @roncohen We decided to expose most part of the intake API as it is in the .NET API (e.g.: transaction.context.request can be directly set in C#). If someone has a good argument to not to do this, this would be a good time to let us know.

watson commented 5 years ago

For what it's worth, the Node.js agent allows users to modify any part of the payload before it's sent to the APM Server - not only context.request - through filters: https://www.elastic.co/guide/en/apm/agent/nodejs/current/agent-api.html#apm-add-filter

gregkalapos commented 5 years ago

For what it's worth, the Node.js agent allows users to modify any part of the payload before it's sent to the APM Server - not only context.request - through filters: https://www.elastic.co/guide/en/apm/agent/nodejs/current/agent-api.html#apm-add-filter

Nice! Thx @watson!

In order to implement this in .NET we need to make those types (context, request, etc.) public anyway, so this is aligned with our plan. Whether we build a similar concept is a different question, but it's worth to consider. But as the prerequisite first we need to expose our types.

skynet2 commented 5 years ago

Probably the best solution! Thanks, as for me, a developer should be responsible for what he is doing, Apm client should have good error logging, so we can see errors from apm server. Is this possible to use somehow https://docs.microsoft.com/en-ca/dotnet/api/microsoft.extensions.logging.ilogger-1?view=aspnetcore-2.2 ? current implementation for logger is only for console

gregkalapos commented 5 years ago

Probably the best solution! Thanks, as for me, a developer should be responsible for what he is doing, Apm client should have good error logging, so we can see errors from apm server. Is this possible to use somehow https://docs.microsoft.com/en-ca/dotnet/api/microsoft.extensions.logging.ilogger-1?view=aspnetcore-2.2 ? current implementation for logger is only for console

Thanks for the feedback @skynet2!

Regarding logging: https://github.com/elastic/apm-agent-dotnet/issues/61 is about exactly that topic. We plan to offer more than a console logger, we already have most of the infrastructure for that.

gregkalapos commented 5 years ago

PR ready for review in #134.

axw commented 5 years ago

@gregkalapos I don't know how much of this applies to .NET, but since you asked: I originally had exposed the model types in the Go agent, and ended up hiding them and adding an API.

The main reason for was performance. The model types in the Go agent store some of the data in unconventional ways in order to minimise memory allocations in the first place, enable object pooling/reuse, and to speed up JSON encoding. We hide all of this behind a more conventional API, doing the conversions and object reuse behind the scenes.

It also means that the breaking model changes don't necessarily require breaking the API and user's code. While that should be fairly infrequent, it's still not nice and may be an impediment to users (especially third-party instrumentation modules) upgrading.

felixbarny commented 5 years ago

We hide all of this behind a more conventional API, doing the conversions and object reuse behind the scenes.

Does the API allow to set all fields of the intake API? Is it possible to retrieve values from it?

It also means that the breaking model changes don't necessarily require breaking the API and user's code.

That's a good point. Especially considering the uncertainty of API changes related to conforming to ECS like whether or not we want to drop context, rename tags to labels etc.

axw commented 5 years ago

Does the API allow to set all fields of the intake API? Is it possible to retrieve values from it?

There are a couple of fields which I didn't provide an API for, because they aren't sensible for Go. Otherwise you can set them all. Sometimes it's a little bit indirect, like you would use transaction.Context.SetHTTPRequest(request) to set the request URL, method, headers, version, cookies, and socket.

In general, the API is write-only. There's a handful of exceptions (transaction name, type, duration, and result), which are fields primarily for aesthetics. They aren't directly part of the model type, but get copied across at serialisation time. It's worth noting that the Go agent does not have any kind of "filter" mechanism whereby users can transform the events.

felixbarny commented 5 years ago

The main reason for was performance. The model types in the Go agent store some of the data in unconventional ways in order to minimise memory allocations in the first place, enable object pooling/reuse, and to speed up JSON encoding.

This is a consideration for Java as well. We're using types which facilitate object reuse like CharBuffers, StringBuilders etc. But we don't want to expose those as it would introduce too tight coupling and increases the risk of breaking changes.

Currently, we don't have an option to set context fields via the API at all but we are considering options to allow for that. It would also be a write-only API so implementing filters would be challenging.

gregkalapos commented 5 years ago

The main reason for was performance. The model types in the Go agent store some of the data in unconventional ways in order to minimise memory allocations in the first place, enable object pooling/reuse, and to speed up JSON encoding.

I see! Interesting. In .NET I don't see any reason for having an extra layer in order to do this. We can do pooling/reuse also when we expose the types that map 1-1 to the server API. I guess this is more like a Go specific thing.

It also means that the breaking model changes don't necessarily require breaking the API and user's code.

That's a good point. Especially considering the uncertainty of API changes related to conforming to ECS like whether or not we want to drop context, rename tags to labels etc.

I personally agree on this one very much, that was the main intention to my original PR (#130). But when we discussed this with @Mpdreamz and @SergeyKleyman we agreed, that those breaking changes only should happen on ~minor~ major server version changes, and we are ok with breaking the API in that case. Another advantage to expose the server API is that if someone looks at the json schema they can immediately use the Agent API, because it's the same. But I personally agree on this one.

~@felixbarny what's the plan in Java? I think most of the stuff isn't exposed currently, right?~ Update: Sorry, have internet trouble, I see your answer

Let me sum up what we learned until now:

Mpdreamz commented 5 years ago

we agreed, that those breaking changes only should happen on minor server version changes...

Small correction major version changes on the server. :)

gregkalapos commented 5 years ago

we agreed, that those breaking changes only should happen on minor server version changes...

Small correction major version changes on the server. :)

👍 Yes, typo, sorry.

felixbarny commented 5 years ago

In .NET I don't see any reason for having an extra layer in order to do this. We can do pooling/reuse also when we expose the types that map 1-1 to the server API. I guess this is more like a Go specific thing.

One good example where the internal representation of the Java agent deviates from the JSON is Transaction.context.request.body. We differ between three cases:

As you can see, there is no 1:1 mapping of the internal model and the intake API. The used types are also quite unconventional and might change at minor or even bugfix versions. This is highly Java specific but maybe you have similar challenges in .NET.

gregkalapos commented 5 years ago

Yes, we also have this challenge in .NET.

But is this really related to public vs. private API?

If I understand correctly you just store body as object and you return either the CharBuffer or the other thing depending on what's stored internally. https://github.com/elastic/apm-agent-java/blob/master/apm-agent-core/src/main/java/co/elastic/apm/agent/impl/context/Request.java#L92

I guess in .NET we would just expose it as it is. We can discuss whether exposing object is a good idea (maybe not :) ). This'd be something like a discriminated union - but we don't have it in C#.

Do you suggest to solve this with a class that is part of the public API and solves the issue with a different type that is smarter and hides the real request.body somehow?

felixbarny commented 5 years ago

The Java agent does not expose the different body data structures at all. What you have linked to is the internal API. The public API could abstract this by just allowing to set the body as a String, for example. This is similar to the Span name. Internally, it's represented as a StringBuilder this allows to append different parts to the name, like name.append(method).append(" ").append(host) without allocating objects by concatenating strings. The public API however only has a setName(String) method, not offering access to the internal StringBuilder. That makes it easier to make internal changes, like using a CharBuffer or just a String instead of a StringBuilder, without introducing breaking changes.

gregkalapos commented 5 years ago

Got it! Yeah, that's how I imagined.

For .NET people: We could still do this with a single class.

public class Span
{
     [JsonIgnore] //internal is ignored anyway...
     internal StringBuilder NameInternal {get; set; } = new StringBuilder();

     public string Name { get => _sb.ToString(); set => _sb = new StringBuilder(value); }
}

Everything that is public would 1-1 map to the server API, the reming stuff is only used internally and nothing is serialized from that. With this:

Note: I personally still lean towards separating the server API model classes from the public API by having different classes in different namespaces, so I just discuss here according to our common decision where we outvoted #130.

gregkalapos commented 5 years ago

Merged https://github.com/elastic/apm-agent-dotnet/pull/134. That covers Transaction.Context. Will continue with the same strategy on Span.

gregkalapos commented 5 years ago

Done and merged.

We went with "4. exposing the intake API as it is.". Both Transaction.Context and Span.Context are exposed to users through the Public Agent API.