elastic / apm-server

https://www.elastic.co/guide/en/apm/guide/current/index.html

Provide an option to configure the url pattern for RUM data #3868

Open hmdhk opened 4 years ago

hmdhk commented 4 years ago

To address the high URL cardinality of RUM data, we decided, as a first step, to let users configure a URL pattern that helps us group transactions by page URL.

This config could accept a regex or another syntax for defining the url pattern.
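A minimal sketch of what the regex variant could look like, in Python for illustration only. The rule format, the `URL_PATTERNS` name, and the example paths are all assumptions; nothing here reflects an existing APM Server or RUM agent option.

```python
import re
from urllib.parse import urlsplit

# Hypothetical rules: (pattern matched against the URL path, group name).
# Neither the rule format nor these paths come from APM Server.
URL_PATTERNS = [
    (re.compile(r"^/blog/[^/]+$"), "/blog/:slug"),
    (re.compile(r"^/users/\d+/orders/\d+$"), "/users/:id/orders/:id"),
]


def group_transaction_name(page_url, patterns=URL_PATTERNS):
    """Return the group name of the first matching pattern, else the raw path."""
    # Only the path is considered; query string and fragment are ignored.
    path = urlsplit(page_url).path or "/"
    for pattern, name in patterns:
        if pattern.match(path):
            return name
    return path
```

With these rules, `https://example.com/blog/hello-world?x=1` would be grouped under `/blog/:slug`, while an unmatched URL such as `https://example.com/pricing` keeps its path as the name.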

Furthermore, this option could be fetched from central config. This has a few advantages: for example, it lets both Kibana and potentially the RUM agent consider this pattern when grouping and sampling transactions. It would also give users easy and scalable configuration.

For more context see this comment

cc @axw

axw commented 4 years ago

Thanks for opening this @jahtalab

We also discussed the possibility of using ML categorisation. I think these would work well together: automatic categorisation would produce regular expressions that the APM Server could periodically fetch, to apply to future transactions. Users could manually override these with their own rules, defined in the UI, for more control.
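As a rough illustration of the precedence described here (user rules first, automatically generated rules as a fallback), a sketch in Python could look like the following. The function names and the rule shape (regex string, group name) are hypothetical, and how the generated rules would be produced or fetched is left out entirely.

```python
import re


def first_match(path, rules):
    """Return the group name of the first rule whose regex fully matches path."""
    for pattern, name in rules:
        if re.fullmatch(pattern, path):
            return name
    return None


def resolve_group(path, user_rules, generated_rules):
    """User rules win; generated rules are the fallback; else keep the raw path."""
    return (
        first_match(path, user_rules)
        or first_match(path, generated_rules)
        or path
    )
```

For example, with a generated rule `/products/[^/]+` and a user rule `/products/legacy-[^/]+`, a path like `/products/legacy-1` would resolve to the user's group even though the generated rule also matches.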

axw commented 3 years ago

@jahtalab @vigneshshanmugam given that we have https://github.com/elastic/apm-agent-rum-js/issues/56, do you still think this enhancement is worthwhile? Do we have any data on how well the heuristic-based approach is working?

vigneshshanmugam commented 3 years ago

@axw Can't say yet whether our current approach is performing well enough. We have deployed the new version of the agent on our website and would like to leave it running for a couple of days to analyse the data points before confirming. WDYT?

axw commented 3 years ago

Sure, there's no rush; I'm just doing some spring cleaning :)

sorenlouv commented 3 years ago

We are currently running into this issue with data from elastic.co:

image

While it's possible to increase the limit, this will only mask the problem. A better solution would be to accurately group transactions that are related (or rather: instances of the same view). Having an optimal grouping of transactions makes it much easier to compare transactions to each other within a group, and understand when and why a particular transaction is slow.

A quick cardinality count shows that there are 1905 unique transaction groups within the past week. That seems quite high. A look into the groups shows they are duplicated for each language. Take for instance the subscriptions page:

image
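To make the per-language duplication concrete, here is a small Python sketch that counts unique transaction names before and after dropping a leading locale segment. The locale list and the path layout are assumptions; the actual elastic.co URL scheme is not shown in this thread.

```python
# Hypothetical set of locale codes that may appear as the first path segment.
KNOWN_LOCALES = {"de", "fr", "jp", "kr", "cn"}


def strip_locale(path, locales=KNOWN_LOCALES):
    """Drop a leading locale segment, e.g. /de/subscriptions -> /subscriptions."""
    segments = path.split("/")
    # For an absolute path, segments[0] is "" and segments[1] is the first segment.
    if len(segments) > 1 and segments[1] in locales:
        return "/" + "/".join(segments[2:])
    return path


def cardinality(names):
    """Number of distinct transaction names."""
    return len(set(names))
```

Under these assumptions, `/subscriptions`, `/de/subscriptions` and `/fr/subscriptions` collapse into a single group once the locale segment is removed.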

There's already some custom grouping happening in the RUM agent but it's clearly not a simple task. Would APM Server be in a better position to group related transactions?

jalvz commented 3 years ago

I'm inclined to say that we could do something about it. We already have the infrastructure to aggregate transactions for histogram metrics, so we would "just" have to plug in the extra bits to aggregate by a new criterion; and we can probably afford the extra computational cycles that RUM can't.

Not to say that it would be easy or we should commit to it, but I certainly would be happy to have a look.

axw commented 3 years ago

Agreed, we certainly could do something about it in the server – but it could be a bit dangerous.

Breakdown metrics, which are calculated by agents, contain the transaction name observed by the agent. Similarly, if we ever want to push the transaction duration histogram aggregation down to the agents (for higher accuracy), then we would need to aggregate on the transaction name.

That's not to say it's impossible, it's just something we'll need to be cautious about to ensure we don't end up with inconsistencies. The server would need to apply the renaming to both trace events and metrics, and potentially combine the aggregated metrics.
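A simplified Python sketch of that consistency requirement: the same rename is applied to trace events and to agent-reported metrics, and metric buckets that end up with the same name are merged. The field names (`name`, `count`, `sum_us`) and the exact-match `rules` mapping are illustrative assumptions, not the actual breakdown metrics schema.

```python
from collections import defaultdict


def rename(name, rules):
    """Map a transaction name to its group; unknown names are unchanged."""
    return rules.get(name, name)


def rename_events(events, rules):
    """Apply the rename to trace events without mutating the originals."""
    return [{**event, "name": rename(event["name"], rules)} for event in events]


def combine_metrics(metrics, rules):
    """Rename metric buckets and sum those that collapse onto the same name."""
    combined = defaultdict(lambda: {"count": 0, "sum_us": 0})
    for metric in metrics:
        bucket = combined[rename(metric["name"], rules)]
        bucket["count"] += metric["count"]
        bucket["sum_us"] += metric["sum_us"]
    return dict(combined)
```

Counts and sums merge cleanly under a rename; anything that cannot be recombined by simple addition would need more care, which is part of why this needs caution.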

simitt commented 3 years ago

Could we introduce something like a transaction.group_name that is set to the transaction.name by default, but can be used to group multiple similar transaction.name values together? That way we would not change transaction.name directly and would not lose information. The UI would then need to use the new field, potentially falling back to transaction.name if it is not available.
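A minimal Python sketch of this idea, with the event modelled as a plain dict for illustration. `group_name` is the proposed field, not an existing one, and `group_for` stands in for whatever grouping logic would be used.

```python
def with_group_name(transaction, group_for=None):
    """Add group_name, defaulting to the original name; name itself is kept."""
    name = transaction["name"]
    group = group_for(name) if group_for else None
    return {**transaction, "group_name": group or name}


def display_group(transaction):
    """What a UI could group by: group_name if present, else fall back to name."""
    return transaction.get("group_name") or transaction["name"]
```

Because `name` is never overwritten, older documents without `group_name` and newer ones with it can be shown side by side using the same fallback.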