spandex-project / spandex_phoenix

Phoenix Instrumentation tracer
MIT License
82 stars 29 forks source link

fix: filter traces #35

Closed novaugust closed 4 years ago

novaugust commented 4 years ago

standby as i verify this branch

novaugust commented 4 years ago

This definitely fixes the filter_traces bit. Can't know about the errors part until i get another error ^.^

zachdaniel commented 4 years ago

just realized, in the error handling code and trace finishing code we should check the filter_traces to make sure that we don't close a trace we didn't start.

novaugust commented 4 years ago

augh right, because finish_trace is called in two different spots >< thanks will add.

zachdaniel commented 4 years ago

We should add that change in finish_trace as well.

novaugust commented 4 years ago

finding the problem w/ error reporting. the conn isn't included in metadata at that point. @zachdaniel is using current_trace_id() a good way to find out if we're in a trace or not, and thus whether we should close it?

zachdaniel commented 4 years ago

Did we ever find out of the stop and exception messages can both happen?

zachdaniel commented 4 years ago

So current_trace_id() tells you that you are in a trace, but not necessarily if you are in the trace that you think you're in. Seems strange that there would be any point in the endpoint where the conn isn't available. Isn't that essentially the one constant data structure?

novaugust commented 4 years ago

Seems strange that there would be any point in the endpoint where the conn isn't available. Isn't that essentially the one constant data structure?

You'd think right? And yet. No conn in metadata in phx exception. Docs specify the following:

%{kind: :throw | :error | :exit, reason: term(), stacktrace: Exception.stacktrace()}

but obviously other things are in there as well, or else my phx_controller? bit wouldn't work. Here's my stacktrace that tells me there's no conn:

Handler "spandex-router-telemetry" has failed and has been detached. 
Class=:error Reason={:badkey, :conn, %{error: ..., ....}

(stacktrace is included in error above, and cites the meta.conn call)

novaugust commented 4 years ago

huh. source code in phx is:

https://github.com/phoenixframework/phoenix/blob/2fdf7a9d7e0d5504cb1115853ef26b06e45a4baf/lib/phoenix/router.ex#L359-L371

which shows that phx_action? shouldn't even work. now i'm confused how i'm getting to where i am

novaugust commented 4 years ago

@zachdaniel re: phx telemetry router events. is it okay to call start_span/stop_span outside of a trace?

novaugust commented 4 years ago

I'm still not seeing any error traces get reported, but have no errors in logs about telemetry disconnecting. Not sure what the story is. Might be worth shipping this for the fix on filter_traces

zachdaniel commented 4 years ago

Changes released as 1.0.3. We should really fix the error reporting though (or confirm that it works), so we can consider this nicely tied up :)

zachdaniel commented 4 years ago

🚀 Thank you for your contribution! 🚀

novaugust commented 4 years ago

@zachdaniel for sure, i really want it working xD I'll make an issue and keep evaluating. maybe we'll get lucky and someone else will see the problem (cough @keathley cough)