trask closed this issue 4 years ago
@trask curious, what application/dependencies are you loading when you test this? Would be good to make it a repeatable test.
Oh yes, I should have included that! I'm using Spring PetClinic for testing, and the benchmarking harness/scripts for this test are at https://github.com/trask/agent-benchmarking/tree/master/coldstart.
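For anyone who wants a rough repeatable measurement without the full harness, here is a minimal sketch of timing a cold start. It is not the script from the linked repository; it just launches a command and measures the time until a marker line shows up in the output. The function name `time_until_ready`, the jar paths in the usage comment, and the choice of `"Started"` as the readiness marker are all assumptions for illustration.

```python
import subprocess
import sys
import time


def time_until_ready(cmd, marker):
    """Launch cmd and return the seconds elapsed until an output line contains marker.

    Raises RuntimeError if the process output ends without the marker appearing.
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so log lines on either stream are seen
        text=True,
    )
    try:
        for line in proc.stdout:
            if marker in line:
                return time.monotonic() - start
        raise RuntimeError(f"{marker!r} not seen before the process output ended")
    finally:
        # The application is only needed until it reports readiness.
        proc.kill()
        proc.wait()
        proc.stdout.close()


# Hypothetical usage (paths are placeholders, not taken from the thread):
#   baseline = time_until_ready(["java", "-jar", "app.jar"], "Started")
#   with_agent = time_until_ready(
#       ["java", "-javaagent:agent.jar", "-jar", "app.jar"], "Started")
```

Running each variant several times on a freshly started machine, and comparing medians, would likely give steadier numbers than a single run.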
oh awesome, thanks!
I just tried PetClinic with all instrumentation enabled and a real exporter, and got a 13s startup time on my Mac, which is also running an IDE and another application under test.
Is this still an issue or should it be closed?
@trask should retest and see how it works in his "standard" environment.
Startup is much slower on single-core cloud machines. But this may just be a concern for Azure (and other cloud providers), where cold start time is a super important metric, and adding even 10 seconds to cold start overhead is frowned upon.
OK. For reference, it took 5.5s without instrumentation, so there's definitely a measurable added delay.
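To put the two Mac figures side by side (13s with instrumentation, 5.5s without), here is the arithmetic as a small sketch. The helper name `startup_overhead` is made up for illustration, and the numbers come from a busy developer machine, so they should not be compared directly with the single-core cloud results.

```python
def startup_overhead(with_agent_s, baseline_s):
    """Return (absolute overhead in seconds, slowdown factor)."""
    return with_agent_s - baseline_s, with_agent_s / baseline_s


# Figures reported in this thread for PetClinic on a Mac.
overhead_s, factor = startup_overhead(13.0, 5.5)
print(f"overhead: {overhead_s:.1f}s, slowdown: {factor:.2f}x")
# prints "overhead: 7.5s, slowdown: 2.36x"
```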
In my previous observations, I found that instrumentation of Spring applications is particularly time-consuming. For SpecialAgent, we addressed this with Static Deferred Attach. If I remember correctly, in the initial use case that prompted us to develop this solution, we saw the startup time of the Spring Boot application in question drop from 40s to 5s.
I wanted to report that there have been major improvements to startup overhead thanks to DataDog's efforts in this area. I'll re-run and post new startup benchmarks soon.
Latest startup overhead in the single-core cloud machine test was 13.5 seconds. That's a very good improvement, and I think it justifies closing this initial tracking issue. I'll open another issue at some point to track further progress.
Have there been more improvements recently?
Current startup overhead on Azure App Service P1V2 instances (single-core) is ~40 seconds.
Initial goal is to get this under 10 seconds in this environment.
And then we can open new issues for subsequent goals. We (Azure) eventually need to get this way under 10 seconds.