Closed — zachdaniel closed this issue 7 years ago
I started working on this one, thinking between 2-3 solutions:

1. Moving `Spandex.Datadog.Api.create_trace()` to a GenServer, with async execution via `cast`
2. A GenServer which aggregates spans in state and periodically flushes them via `Api.create_trace()`

What do you think?
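A minimal sketch of option 2, assuming `Spandex.Datadog.Api.create_trace/1` accepts a list of spans as in the current synchronous path (the module name, flush interval, and `publish/1` API below are hypothetical):

```elixir
defmodule Spandex.Datadog.AsyncApiServer do
  @moduledoc """
  Sketch: buffers finished spans in GenServer state and periodically
  flushes them to the agent in a single batch.
  """
  use GenServer

  # Hypothetical flush interval; would need tuning under real load.
  @flush_interval 1_000

  def start_link(_opts \\ []) do
    GenServer.start_link(__MODULE__, [], name: __MODULE__)
  end

  # Callers never block: spans are handed off with an async cast.
  def publish(spans), do: GenServer.cast(__MODULE__, {:publish, spans})

  @impl true
  def init(_) do
    schedule_flush()
    {:ok, []}
  end

  @impl true
  def handle_cast({:publish, spans}, buffer) do
    {:noreply, spans ++ buffer}
  end

  @impl true
  def handle_info(:flush, []) do
    # Nothing buffered; just reschedule.
    schedule_flush()
    {:noreply, []}
  end

  def handle_info(:flush, buffer) do
    Spandex.Datadog.Api.create_trace(buffer)
    schedule_flush()
    {:noreply, []}
  end

  defp schedule_flush, do: Process.send_after(self(), :flush, @flush_interval)
end
```

This combines both ideas: the `cast` keeps the traced process from blocking, and the periodic flush amortizes the HTTP overhead across many spans.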
I've done a simple benchmark where the raw method takes around ~0.9 s (simulating work with `:timer.sleep/1`). The same execution with tracing is almost 10x slower at the moment, which we can also see when load testing our product API:
```elixir
defmodule TestModule.WithTrace do
  use Spandex.TraceDecorator

  @decorate trace()
  def call do
    process()
    process()
    process()
  end

  @decorate span()
  def process() do
    :timer.sleep(50)
    fetch()
    fetch()
    :timer.sleep(50)
    fetch()
    fetch()
  end

  @decorate span()
  defp fetch() do
    :timer.sleep(50)
  end
end
```
```elixir
defmodule TestModule.WithoutTrace do
  def call do
    process()
    process()
    process()
  end

  def process() do
    :timer.sleep(50)
    fetch()
    fetch()
    :timer.sleep(50)
    fetch()
    fetch()
  end

  defp fetch() do
    :timer.sleep(50)
  end
end
```
```elixir
Benchee.run(%{
  "tracing: OFF" => fn -> TestModule.WithoutTrace.call() end,
  "tracing: ON " => fn -> TestModule.WithTrace.call() end
})
```
```
Name                   ips        average  deviation         median
tracing: OFF          1.09         0.92 s     ±0.00%         0.92 s
tracing: ON          0.112         8.96 s     ±0.00%         8.96 s

Comparison:
tracing: OFF          1.09
tracing: ON          0.112 - 9.76x slower
```
Currently traces are published synchronously by the process doing the tracing, which is definitely not scalable.
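Even without buffering, the synchronous publish could be taken off the hot path by handing the HTTP call to a supervised task. A sketch, assuming a `Task.Supervisor` named `Spandex.TaskSupervisor` is started in the application's supervision tree (that name and `publish_async/1` are assumptions, not existing Spandex API):

```elixir
# Fire-and-forget: the traced process returns immediately while the
# supervised task performs the HTTP call to the Datadog agent.
defp publish_async(trace) do
  Task.Supervisor.start_child(Spandex.TaskSupervisor, fn ->
    Spandex.Datadog.Api.create_trace(trace)
  end)
end
```

This removes the per-call latency but still makes one HTTP request per trace, so the GenServer-with-batching approach is likely the better long-term fix.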