Open ayrat555 opened 4 years ago
This change would ultimately end up breaking any existing setups, so we probably want to make it a configuration that defaults to false. Anyone who wants to tackle this is welcome.
@zachdaniel it's actually a safe, non breaking change to make the telemetry listeners dynamic. but it doesn't solve the problem that all phx installs broadcast on the same keys, and so doing two installs on those would be bad. (in other words, fixing the endpoint listener to be dynamic would just kick the can to the next line, which is the router listeners)
we'd have to change the api for installing prefixes to have a separate, optional installation for adding extra endpoints listeners (without also adding more phx listeners) :/ suboptimal, but it's just the fallout of allowing users to have dynamic telemetry event names at the Plug.Telemetry level (problematic design leads to problems, sort of thing)
Hmm... so I get what you're saying about making the telemetry listeners dynamic.
As for the router, you're saying there isn't a way to add an event prefix to the router events?
so the problem is:
Plug.Telemetry, prefix: [:my_app, :my_endpoint]
)thus if someone installs twice, once for two different endpoints, the start/stop traces will behave normally (that's done in endpoint events), but spans will get doubled up - we'll start a span and immediately start the same span, because we installed 2 listeners on the same router events. (spans are handled in the router).
So this sounds essentially like phoenix doesn't support telemetry consumers for multiple routers then...
So this sounds essentially like phoenix doesn't support telemetry consumers for multiple routers then...
Yeah, ideally endpoints and routers always emit the exact same events wrt name, and identify themselves via a metadata field. I think we can get the router by introspecting on the conn though, so it should be doable.
FYI this is something on my to-do list, as we similarly have 2 endpoints running on a monolith that therefore use the same spandex server. Hopefully have a proof of concept by EOM
Some brainstorming, I think this'll warrant a new function, SpandexPhoenix.Telemetry.install_many/1
, where you can essentially pass options you'd pass to install()
multiple times, once for each endpoint you want traced.
I don't think making it a new option in install/1
is viable because people with multiple endpoints will doubtless want to make each its own service or give some a different tracer than others. That requires bundling together not only the endpoint's prefix but also all the other options...
The only unknown i have on installing different tracers for different endpoints is tracking down which endpoint each phx router dispatch event is being called for, and using the corresponding arguments. Hopefully there's something in the event's metadata i can do introspection on to match things up. If not, I'll be a bit flummoxed.
I came across the same issue today. I run multiple endpoints in the same app and would like to tag the span with a different service for each of them. My current thinking to work around the issue is to create a plug which assigns the desired service name to the Plug.Conn.private field and then use the :customize_metadata callback to read that value and set the service accordingly. Any thoughts?
So catching up on this thread, it looks like there are two problems:
You can't actually call SpandexPhoenix.Telemetry.install/1
more than once, even with different options because this line and this line are
registering their handlers using static values for the handler IDs. This should be relatively easy to solve because we could just use the event_prefix
as part of the handler ID so that they don't collide.
The remaining problem if you do step 1 is that then we'd end up registering the same [:phoenix, :router_dispatch, *]
events to both handlers, so they would both receive all of the events. I just spun up a test project and confirmed that Phoenix does insert the Endpoint in the conn.private.phoenix_endpoint
, in both the Plug.Telemetry
and :router_dispatch
events, so it might be possible to combine that somehow with config options to determine which one should go with which. I can sort of see a path forward here, but to be honest, I don't think I personally have the energy to pursue it, because at work we don't even use spandex_phoenix
anymore, and we just have some internal Telemetry handlers that accomplish the same thing, giving us full control over how we want to handle each event and call Spandex ourselves. I realize that this is not helpful to the community and I'd like to work on an open-source solution that will be more helpful, but that's my current situation right now. 😅
I think what you're describing @mruoss could work as long as you work around issue 1 above. I think you could do that without any changes in Spandex if it does solve your problem. It probably wouldn't be too hard, it's just more work than you should need to do if Spandex could do it somehow for you. That will get you a solution more quickly, but if someone has the time and energy to come up with a PR that enables this, I will be happy to help test it and get it released for others to use.
Hi @GregMefford
I have implemented my approach and it works. Issue 1 above is not really an issue in my case as I'm calling SpandexPhoenix.Telemetry.install/1
only once as follows:
SpandexPhoenix.Telemetry.install(
customize_metadata: fn conn ->
service = Map.get(conn.private, :service, :elixir)
conn
|> SpandexPhoenix.default_metadata()
|> Keyword.put(:service, service)
end
)
And then as described above, use a plug in the endpoint declaration to set the service using Plug.Conn.put_private/3
.
So this is a working workaround, but as you say, I guess the library should handle it out of the box somehow. But I don't see that solution either at this moment.
I have two phoenix 1.5 sub-projects in the umbrella application. I added
Plug.Telemetry
into both of Endpoints.Then I added
SpandexPhoenix.Telemetry.install
to both of these applicationsOne of them always fails with match error
{:error, :already_exists}
. It's an error from telemetry - https://hexdocs.pm/telemetry/telemetry.html which means the handler_id is already taken.I suggest appending endpoint prefix to both
spandex-router-telemetry
andspandex-endpoint-telemetry
handler ids - https://github.com/spandex-project/spandex_phoenix/blob/master/lib/spandex_phoenix/telemetry.ex#L95.