elastic / apm-data

apm-data holds definitions and code for manipulating Elastic APM data
Apache License 2.0

Better error grouping key for opentelemetry SDKs #299

Open lahsivjar opened 2 months ago

lahsivjar commented 2 months ago

The current logic for the error grouping key prioritizes exception.type, log.param_message (the unformatted error string), and the stacktrace. The final fallback is log.message, which is the formatted error string. This logic keeps the cardinality of the grouping key contained. With OTel SDKs, however, we get neither a stacktrace nor log.param_message, so the grouping key relies only on exception.type, which takes away much of the usefulness of the grouping.

One possible solution is to use log.message when we can't get log.param_message or the stacktrace. We can contain the cardinality explosion with locality-sensitive hashing or with heuristics (for example, tokenizing the message and removing parameterized fields such as URIs or numbers).
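To make the heuristic option concrete, here is a minimal sketch in Go. It is not the existing apm-data implementation: the names normalizeMessage and groupingKey, the placeholder tokens, the regular expressions, and the choice of MD5 are all assumptions made for illustration. It splits the message on whitespace, replaces tokens that look like URIs or contain digits with fixed placeholders, and hashes the result together with exception.type.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"regexp"
	"strings"
)

var (
	// Hypothetical heuristics: a token is treated as a URI if it has a
	// scheme ("http://...") or starts with a slash ("/users/42").
	uriPattern = regexp.MustCompile(`^(?:[a-zA-Z][a-zA-Z0-9+.-]*://\S+|/\S*)$`)
	// Any token containing a digit is treated as parameterized.
	digitPattern = regexp.MustCompile(`\d`)
)

// normalizeMessage tokenizes msg on whitespace and replaces tokens that
// look parameterized with placeholders, so that messages differing only
// in those values normalize to the same string.
func normalizeMessage(msg string) string {
	tokens := strings.Fields(msg)
	for i, tok := range tokens {
		// Ignore surrounding punctuation when classifying the token.
		trimmed := strings.Trim(tok, `"'.,:;()[]{}`)
		switch {
		case uriPattern.MatchString(trimmed):
			tokens[i] = "<uri>"
		case digitPattern.MatchString(trimmed):
			tokens[i] = "<num>"
		}
	}
	return strings.Join(tokens, " ")
}

// groupingKey hashes the exception type together with the normalized
// message. MD5 is an assumption here; any stable hash would do.
func groupingKey(exceptionType, logMessage string) string {
	h := md5.New()
	h.Write([]byte(exceptionType))
	h.Write([]byte{0}) // separator so the two fields cannot run together
	h.Write([]byte(normalizeMessage(logMessage)))
	return hex.EncodeToString(h.Sum(nil))
}
```

With this sketch, "GET /users/42 failed" and "GET /users/43 failed" produce the same key, while "connection refused" and "connection reset" stay in separate groups. The digit rule is deliberately crude (it would also collapse tokens such as "tcp4"), so a real implementation would need tuning against messages from several SDKs.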

NOTE: The current evidence was collected ONLY using opentelemetry-go; we may want to check other language SDKs for a holistic view.