create a config that allows to exclude spans from transactions

devcorpio commented 2 years ago

There are cases where our users are interested in exclude certain spans from a transaction.

In order to achieve that we are going to create a new configuration named ignoreSpans. It will work the same way as ignoreTransactions config

kyungeunni commented 2 years ago

There are two approaches I could think of:

1. Checks `ignoreSpans` on Span creation

When creating a Span, check the ignoreSpans config and do not create or push it to a transaction when it matches.

cons: There are several ways to create the span, you can use transaction.createSpan(), new Span() or other API. So it'd be challenging to make sure we find all the places where the span creation happens and add checks there. Or we need to introduce a single method that creates spans and discourages appending or creating spans manually and append it directly to a transaction. In this case, we would need to enforce this behavior by changing spans field to private so that it can not be added manually(e.g.tr.spans.push(span)) but it could be incompatible where the spans field needs to be accessed from the outside.
pros: With this approach, we don't need an additional step where we filter out spans at the later stages. If we create a single entry for creating spans, it could be useful when we need to add a hook or some common functions for span creation in the future.

2. Filter out in `transaction:end` event

In transaction:end event handler when calculating transaction duration, we can filter out all the spans that need to be ignored before sending the transaction:end event.

cons: Compared to approach number 1, it'd use more memory and introduce an additional iteration in transaction:end.
pros: It seems like minimal change is needed. After all, the cons mentioned above are not a big deal. (I think!)

I'd prefer to choose path number 2 going forward. Let me know what you think @devcorpio @vigneshshanmugam !

vigneshshanmugam commented 2 years ago

Filter out in transaction:end event

This can already be done by the user by using observe and addFilter API and not so beneficial for the RUM agent interms of managing memory and keeping track of that particular span.

My vote is for the proposal 1, Reason being Users cannot create spans by using arbitrary API's. We have a single API to create span and that is via the documented transaction.startSpan - , if we add the filtering there that should be good enough.

@devcorpio Would like to hear your opinion as well.

devcorpio commented 2 years ago

Thanks both for sharing your thoughts! 🎉

TL;DR:

My vote goes for proposal 1 while only focusing on spans created by the agent. (the ones a user cannot control)

Reasoning behind my vote:

I agree that at first glance the second proposal might seem a minimal one and then the way to go. Nevertheless, I tend to dislike the idea of having a span hanging around (while keeping tracking and performing operations, etc. which will not be used for anything) during the whole lifecycle of a transaction when we have the opportunity of discarding it from the start.

When it comes to the logic in charge of checking the ignoreSpan we need to bear in mind things such as:

Is the agent the one creating the span that should be ignored? (This was precisely the main reason behind this issue)
Is the user creating a span that should be ignored? (This seems a bit weird since the user have the control here anyway)

1. Spans created by the agent

The existing modus operandi here (at least the most common) tends to create spans that will be eventually pushed in a data structure that transaction model exposes. example 1 and example 2.

Probably it would be a good idea to change the approach here. For instance (it can be a different thing, I'm just sharing an idea), using a method rather than mutating the structure directly.

2. Spans created by the user

As @vigneshshanmugam commented, using the public api we expose in the agent, the only way to create spans is via startSpan.

On the other hand, does it make sense to check the ignoreSpan config in this case? As mentioned above, the user has control here anyway. The only case that I can think of as something that might make sense is "I would like to stop sending the spans that I just created for blocking the transaction". But in this case such span should be discarded in the logic that handles the end of the transaction and then we should rethink that maybe proposal 2 (through this) is the one to apply 🧐

To be totally honest, I wouldn't take into account spans created by users, it seems pretty unusual scenario. I would focus on spans created by the agent.. If we decide to do this, perhaps we should emphasise that nuance in the new documentation

What do you think?

kyungeunni commented 2 years ago

Thank you both for sharing your thought!

@vigneshshanmugam you mentioned:

This can already be done by the user by using observe and addFilter API

I think the reason this issue is created is that calculating the transaction time happens prior to the addFilter or observe callback is called so the total time includes the times spent for ignored spans as well.

I agree with what @devcorpio mentioned above that we should focus on the spans created by the agent. Also, In my humble opinion, it is important to have consistency for any span, either created by the agent or by users, so there is no surprise for users when using our product. (for example, if I was a user, when I specified ignoreSpans config, I would expect the config applied to ALL spans, not those only created by the agent or created by me. If the config works differently by arbitrary rule, we need to document it and expect the users to find it out)

After hearing your comments, I'm happy to explore option 1 and try to come up with a way we can effectively centralize the place checking ignoreSpans config so that we won't need to check for all the places that spans are created throughout the codebase, and that would also help not introducing a different behavior for the spans created by users and agents.

xirdneh commented 1 year ago

Hi, I'm a user and I'd like to say I'm very interested about this feature.

The reason is because we create a bunch of custom transactions from the front end so we can add enough labels and context to make it easier to debug stuff and also because a lot of user initiated actions are not correctly instrumented by the agent when using things like Redux and RTK because a of the network calls happen async and at that point the agent closes the auto-instrumented transaction. I don't think this last point is a downside of the agent, rather is an app implementation detail. So, the current behavior should be expected.

Now, the issue is that while we want to keep the custom transaction running until we're sure all side-effects are done there's a bunch of calls to other tracking API's and other stuff that's not important for the specified user action and this causes our custom transactions to be polluted with a bunch of data that doesn't help with debugging or understanding how the app works.

I agree that the troublesome spans are automatically created.

Also, is there any way to do this right now? Is there a way to drop automatically created spans using adsFilter?

Thanks for all y'all's work!

kyungeunni commented 1 year ago

Hi @xirdneh! Thanks for your interest!!

Yes, for the time being you might consider using addFilter API to filter out spans you'd like to exclude. I think the example shown in the link might give you an idea of how to access the transactions and spans, but here's the pseudo-code to demonstrate how you could manipulate the payload:

apm.addFilter(function (payload) {
  if (payload.errors) {
    payload.errors.forEach(function (error) {
      error.exception.message = error.exception.message.replace('secret', '[REDACTED]')
    })
  }
  if (payload.transactions) {
    payload.transactions.forEach(function (tr) {
      const filteredSpans = tr.spans.filter(span => shouldIncludeSpan);
      tr.spans = filteresSpans;
    })
  }
  return payload
})
...
function shouldIncludeSpan(span) {
    // some logic to filter span
    return true;
}

I will also keep you updated with the progress on the ignoreSpans. Thanks!!

xirdneh commented 1 year ago

@kyungeunni Thanks, this works great!

for future references it would be great if spans had attributes to know if it's an HTTP request span and to easily grab the HTTP verb and URL. This would make it easier to filter out the unwanted spans. For now we had to do something like this to figure out if a span was an HTTP request and which URL it is using:

    apmClient.addFilter((payload) => {
      if (!payload.transactions) return payload;
      payload.transactions.forEach((transaction) => {
        const filteredSpans = transaction.spans.filter((span) => {
          const spanNameParts = span.name.split(' ');
          const url = spanNameParts.find((part) => part.startsWith('https://'));
          if (!url) return span;
          const spanUrl = new URL(url);
          if (!hostnamesToExclude.includes(spanUrl.hostname)) return span;
        });
        transaction.spans = filteredSpans;
      });
      return payload;
    });

elastic / apm-agent-rum-js