DataDog / dd-sdk-android

Datadog SDK for Android (Compatible with Kotlin and Java)
Apache License 2.0

What's the preferred `DataDogInterceptor` sampling rate? #2082

Open scana opened 3 weeks ago

scana commented 3 weeks ago

Question

Hi!

We've been discussing within our team what the preferred sampling rate for `DataDogInterceptor` should be.

We track all of our sessions in RUM (`internal const val DEFAULT_SAMPLE_RATE: Float = 100f`), but noticed that the interceptor itself defaults to only 20% (`internal const val DEFAULT_TRACE_SAMPLE_RATE: Float = 20f`). We currently keep it at 1% (someone set it this way without any specific reason).

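Hence I wanted to ask:

- is there anything to consider client-performance-wise before setting this to 100%?
- are there any cost implications of tracking more requests?
- does it make sense to track all of the requests?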

MaelNamNam commented 3 weeks ago

Hi @scana 👋

> is there anything to consider client-performance-wise before setting this to 100%?

The performance impact of tracking more or fewer network requests is negligible and not noticeable in most cases, unless your app is sending billions of network requests per minute :) However, the more things you track, the more data our SDK uploads to Datadog's backend; this is the inherent tradeoff of observability. We usually recommend keeping this in mind when instrumenting your application, and I'm afraid we don't have a "one size fits all" answer to provide.

We're working on our end on producing up-to-date benchmarks of the SDK footprint. The last time we computed some, uploads were around 73 KB and downloads were around 5 KB for an average 2:10 session with some browsing, taps, swipes, opening and closing views, the app sending some network requests…
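To put those benchmark figures in perspective, here is a back-of-the-envelope sketch in Kotlin. It only relies on the numbers quoted in this reply (about 73 KB uploaded and 5 KB downloaded for a 2:10 session). The `SessionFootprint` type, the `dailyTrafficKb` helper, and the 10,000 sessions per day figure are hypothetical and are not part of the SDK.

```kotlin
// Rough per-session network footprint, using the figures quoted in this reply.
// All names here are illustrative; none of them come from the Datadog SDK.
data class SessionFootprint(
    val uploadKb: Double,
    val downloadKb: Double,
    val durationSeconds: Int
) {
    // Average upload throughput over the session, in KB per second.
    val uploadKbPerSecond: Double
        get() = uploadKb / durationSeconds
}

// Total traffic (upload + download) for a given number of sessions, in KB.
fun dailyTrafficKb(footprint: SessionFootprint, sessionsPerDay: Int): Double =
    (footprint.uploadKb + footprint.downloadKb) * sessionsPerDay

fun main() {
    // 2:10 session = 130 seconds, ~73 KB up, ~5 KB down (from the benchmark quoted in this reply).
    val measured = SessionFootprint(uploadKb = 73.0, downloadKb = 5.0, durationSeconds = 130)

    // Assumption: 10,000 sessions per day, chosen only for illustration.
    val totalKb = dailyTrafficKb(measured, sessionsPerDay = 10_000)

    println("Upload rate: %.2f KB/s".format(measured.uploadKbPerSecond))
    println("Daily traffic: %.0f KB".format(totalKb))
}
```

These are averages for one sample session, so the result is an order-of-magnitude estimate at best; an app that sends many more network requests per session would sit above it.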

> are there any cost implications of tracking more requests?

Not on the RUM side, as the billing unit is the Session, regardless of whether a Session contains 1 View only, or whether it contains an insane volume of Views, Resources, Errors, Actions... However, if you use distributed tracing, the sampling decision taken client-side by the SDK will override the ones defined in your backend services monitored with APM, which can result in a surge of spans being ingested, and therefore a surge of costs. We are working on some initiatives on this front to give you more control over client-side sampling decisions, but these are very complex projects and will take a few months if not a few quarters to be released.
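As a rough illustration of why the client-side rate matters for APM volume, the sketch below models a per-request sampling decision and the number of traced requests it leads to. It is not the SDK's implementation: `PercentSampler`, `expectedTracedRequests`, and the request volume are hypothetical, and it assumes that every request sampled on the client results in a trace that the backend keeps.

```kotlin
import kotlin.random.Random

// Hypothetical rate-based sampler: keeps roughly `ratePercent` % of requests.
// This is an illustration, not the sampler shipped in the Datadog SDK.
class PercentSampler(
    private val ratePercent: Float,
    private val random: Random = Random.Default
) {
    init {
        require(ratePercent in 0f..100f) { "ratePercent must be within 0..100" }
    }

    // Decision taken once per request, on the client.
    fun sample(): Boolean = random.nextFloat() * 100f < ratePercent
}

// Expected number of traced requests for a given volume and sample rate.
fun expectedTracedRequests(requests: Long, ratePercent: Float): Double =
    requests * (ratePercent / 100.0)

fun main() {
    val requestsPerDay = 1_000_000L // assumed volume, for illustration only

    // 1% is the rate from the question, 20% the interceptor default, 100% "track everything".
    for (rate in listOf(1f, 20f, 100f)) {
        val expected = expectedTracedRequests(requestsPerDay, rate)
        println("rate=$rate% -> about ${expected.toLong()} traced requests per day")
    }

    // Empirical check of the sampler over 100,000 simulated requests.
    val sampler = PercentSampler(20f, Random(42))
    val kept = (1..100_000).count { sampler.sample() }
    println("kept $kept of 100000 at 20%")
}
```

Under these assumptions the traced volume scales linearly with the rate, so moving from 1% to 100% multiplies it by 100. Each traced request can also fan out into several backend spans, which this sketch does not model.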

> does it make sense to track all of the requests?

In most cases, no, it doesn't. But we do see customers tracking all of them in some cases:

Hope this helps :) Let me know if I answered your questions and I'll close the issue.