openzipkin-contrib / brave-ratpack

Brave instrumentation for Ratpack
Apache License 2.0
5 stars 4 forks source link

Use client interceptors added in Ratpack 1.6 #29

Closed llinder closed 5 years ago

llinder commented 5 years ago

Looks like Travis doesn't appreciate this test https://github.com/openzipkin-contrib/brave-ratpack/blob/master/src/test/java/brave/http/ITZipkinHttpClientImpl.java#L105 . Locally all is fine but I guess Travis CPU restrictions is causing a deadlock :(

llinder commented 5 years ago

good work!

I think by default the IT doesnt expect implicit redirect handling. i guess we can add a test here for that or a TODO

Testing for implicit redirect handling would be useful. I could overwrite that one test to do this as well I suppose. Still I feel like there isn't much in the default integration tests that is getting in the way. Mainly I'm struggling with making the Ratpack HTTP Client Interceptor APIs work, mostly because there isn't a way to pass state between the request and response interceptor without putting the state in the Execution context. While using the Execution context should be fine, it doesn't behave as expected with the test harness setup. During tests the Execution context is being shared between requests which to my understanding should not be the case.

I think for now my game plan is to test this in a real application to see how it behaves. If a simple real world app doesn't exhibit anything weird, then I can move on to sorting out the Travis CI difficulties :/

codefromthecrypt commented 5 years ago

Testing for implicit redirect handling would be useful. I could overwrite that one test to do this as well I suppose. Still I feel like there isn't much in the default integration tests that is getting in the way. Mainly I'm struggling with making the Ratpack HTTP Client Interceptor APIs work, mostly because there isn't a way to pass state between the request and response interceptor without putting the state in the Execution context. While using the Execution context should be fine, it doesn't behave as expected with the test harness setup. During tests the Execution context is being shared between requests which to my understanding should not be the case.

i see, so Execution context isn't request scoped. yeah that could be a problem. A bounded weak hash map might be an ok alternative, if there is no other bucket to use.

I think for now my game plan is to test this in a real application to see how it behaves. If a simple real world app doesn't exhibit anything weird, then I can move on to sorting out the Travis CI difficulties :/

good idea to conserve energies! Thanks for the thorough reply!

drmaas commented 5 years ago

While using the Execution context should be fine, it doesn't behave as expected with the test harness setup

My understanding is that an Execution is not shared. They end up in a queue and get executed on various even loop threads.

In the ExecHarness class - it provides a single execution (by creating a new event loop and forking an execution on it) on which to perform activities.

I am interested to know what exactly happens to the Execution variables between request reception and response processing, and why things aren't working as expected in ExecHarness.

llinder commented 5 years ago

My understanding is that an Execution is not shared. They end up in a queue and get executed on various even loop threads.

This ended up being correct, thanks for nudging me in the right direction! I think with all of the ExecHarness issues I was confusing integration test issues and completely different runtime issues.

In the ExecHarness class - it provides a single execution (by creating a new event loop and forking an execution on it) on which to perform activities.

This statement does seem to match my experience in that every call to harness.yield was producing a shared execution for every HttpClient request. In the end I worked around it by putting any ClientSpanHolders back into the registry. This isn't ideal but without it the tests would fail.

I am interested to know what exactly happens to the Execution variables between request reception and response processing, and why things aren't working as expected in ExecHarness.

When multiple requests are issued during a single test, each call to the interceptor request adds a new Span to the registry. The behavior I see is that the previous Span is already in the registry when the second call happens. Then later when the interceptor receive method is invoked we get the Span, which may or may not be the last one, and remove it from the registry. Remove also happens to clear out all values queued up in the registry that match the type. This is a different nit pick entirely but there isn't really a way to only take a single value without taking out all values and inserting things back.

When I test this with a standalone test app the Execution is not shared which is what I would expect. Ironically though, redirects invoke the interceptor receive without an Execution entirely at runtime. End result is that redirects don't play well with the interceptor so I ended up having to just ignore them. Even trying to do something like annotate the Span on redirect didn't work since the interceptor receive method is always called before onRedirect so the Span is completed before the annotation can be applied.

And last but not least, the client interceptor receive method invocation is delayed. This is usually a couple milliseconds. In the screenshot the first Span is the server request/response. Second span is a location span around the client call. Third span is a remote http call generated by the client interceptor.

image

Here is another example, this time using a ParallelBatch to perform several calls in parallel.

image

kirenjolly commented 5 years ago

@llinder Is there any estimated time on when this PR will get merged and released, since we are facing a production issue due to context memory leak when using streaming requests. It will be highly appreciated if we could get this fix soon :)

llinder commented 5 years ago

@kirenjolly I think the only thing remaining is to pull the trigger and accept the differences between the interceptor approach vs. the older wrapped http client approach. After getting the interceptor approach working its clear that both have their own pros/cons. Reviving the wrapped client approach as a separate library for 1.6 and 1.7 is always another option so I don't see this MR as a one way street which reduces risk I suppose.

Since requestStream has been an issue for us as well I'm going to do a bit more testing with that. In theory this should resolve the memory leak but I would like to see it run for a while and monitor heap usage at least.