yammer / breakerbox

Frontend for Tenacity + Archaius
Apache License 2.0

Transfer-encoding in breakerbox dashboard (ProxyStreamServlet) #44

Open psundar-bill opened 7 years ago

psundar-bill commented 7 years ago

In ProxyStreamServlet.java, is it right to also add the Transfer-Encoding header? I get only pings but no data from the Hystrix metrics stream, and I suspect this may be the cause.

chrisgray commented 7 years ago

It should set the Transfer-Encoding: chunked header. Do you have breakerbox correctly configured to be able to retrieve metrics from your instances? Are you using YamlInstanceDiscovery, one of the other supported discovery methods, or something you've written yourself?

psundar-bill commented 7 years ago

Thanks for the quick reply. I have been struggling with this issue for some time. By comparison, I found that the code adding the Transfer-Encoding header has been removed from ProxyStreamServlet.java in hystrix-dashboard, but it is retained in breakerbox-dashboard. Can you please explain the difference?

NOTE: My problem is that I get data from the dashboard while it is running locally but not on the office network. Also, even on the office network, I am able to get the Hystrix metrics stream of my application directly, but not through the dashboard. So I probed further. It would help me debug if you could explain the difference in behavior.

chrisgray commented 7 years ago

It looks like they are removing it, as you can see in PR 671. I can look at pulling that in or depending on the ProxyStreamServlet directly. I remember we had to make some changes to have it work seamlessly with the Breakerbox dashboard. Let me know if the Transfer-Encoding issue is something you're running into. It looks like they were able to see an error thrown from curl, if that is something you're encountering.

psundar-bill commented 7 years ago

I observe that I do get the data, but when I look in Wireshark, the '\r\n' of every chunk arrives in a different TCP packet than the data. As a result the chunk is not displayed, because it always seems incomplete. Only when the dashboard server is shut down is all the data displayed.

psundar-bill commented 7 years ago

I think that the issue at https://bugs.eclipse.org/bugs/show_bug.cgi?id=442479 is causing the problem. When I use curl 7.43.0, it just hangs. With curl's --raw option, all the chunks are displayed once the server terminates. Could you please check?

chrisgray commented 7 years ago

Interesting, a lot of it does point to the chunked encoding being troublesome. We are running Jetty 9.4.x in the latest breakerbox and this bug targets 9.2.x, so it's pretty old and most likely outdated. However, it might be best to simply disable chunked encoding like they did in the ProxyStreamServlet and see if that fixes things. I'll get that merged in today. Could you try out a snapshot release and see if that fixes your issue?

psundar-bill commented 7 years ago

Thanks, but that did not fix it. Even now, Jetty 9.4.x sends the CRLF of the earlier chunk with the next chunk or after a certain delay. Hence, when responses are sent in this manner, it works on localhost but not over the network. Maybe this gives you a hint for suggesting a workaround.

Thanks for your help.

chrisgray commented 7 years ago

Yeah, it might make sense to simply disable chunked encoding. Let me see if I can get this commit into a branch right now.

psundar-bill commented 7 years ago

Thanks a lot. Note that if it is done like ProxyStreamServlet, i.e. without adding Transfer-Encoding: chunked, the servlet still adds chunked by default if Content-Length is not specified.

chrisgray commented 7 years ago

Can you try

https://oss.sonatype.org/content/repositories/snapshots/com/yammer/breakerbox/breakerbox-service/0.6.5-SNAPSHOT/breakerbox-service-0.6.5-20170926.204909-10.jar

Or building the latest of no_chunked_encoding

psundar-bill commented 7 years ago

I have tried this already, but Transfer-Encoding: chunked still gets added, as that is the default if Content-Length is not specified.

Can we not do HTTP streaming without chunked encoding?

chrisgray commented 7 years ago

Pretty much... you either need a Content-Length or Transfer-Encoding: chunked for it to be a valid response I believe.

In your case, though, it's odd that it's working for you locally but not on the network where you're encountering the issues. Is there some proxy/reverse proxy in between you and breakerbox?
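On the question of how a response body is delimited, the following is a minimal Java sketch of the HTTP/1.1 framing rule as seen from a client. It is a simplification: the class and method names (BodyFramingSketch, framingOf) are hypothetical and come from neither breakerbox nor Jetty, and a plain substring check stands in for full Transfer-Encoding parsing. Besides Content-Length and Transfer-Encoding: chunked, HTTP/1.1 also lets a response body be delimited by closing the connection. Whether Jetty can be made to stream that way is not established anywhere in this thread.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only.
class BodyFramingSketch {

    enum Framing { CHUNKED, CONTENT_LENGTH, UNTIL_CLOSE }

    // Simplified decision a client makes to find the end of a response body.
    static Framing framingOf(Map<String, String> headers) {
        // Header names are case-insensitive, so copy into a case-insensitive map.
        Map<String, String> h = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        h.putAll(headers);

        String te = h.get("Transfer-Encoding");
        // Assumption: a substring check replaces real parsing of the encoding list.
        if (te != null && te.toLowerCase(Locale.ROOT).contains("chunked")) {
            return Framing.CHUNKED; // takes precedence over Content-Length
        }
        if (h.containsKey("Content-Length")) {
            return Framing.CONTENT_LENGTH;
        }
        // Neither header present: the body runs until the server closes the connection.
        return Framing.UNTIL_CLOSE;
    }

    public static void main(String[] args) {
        System.out.println(framingOf(Map.of("transfer-encoding", "chunked")));
        System.out.println(framingOf(Map.of("Content-Length", "42")));
        System.out.println(framingOf(Map.of("Content-Type", "text/event-stream")));
    }
}
```

With a fixed Content-Length the client stops reading after that many bytes, so that header only suits a body of known size, not an open-ended event stream.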

psundar-bill commented 7 years ago

No, there is no proxy/reverse proxy. I just tried using Content-Length and then I am able to get the first streamed data up to the first '\n'. So the problem is that I am getting the CRLF delayed.

If there is any workaround, please let me know. Is it easy to change the container (Jetty) and try?

chrisgray commented 7 years ago

What's the curl command you are using? Also are you able to replicate this behavior locally?

psundar-bill commented 7 years ago

curl -iv --raw http://:8080/tenacity/proxy.stream?origin=localhost:8080/turbine.stream?cluster=breakerbox

The issue of the CRLF arriving in the following chunk is observed both locally and over the network. I tried the Content-Length solution over the network.

chrisgray commented 7 years ago

I just tried that locally with the latest breakerbox master and was unable to reproduce. I'm running curl 7.55.1 on macOS Sierra.

psundar-bill commented 7 years ago

May I know what you are trying to reproduce? Also, are the dashboard and the curl client both running locally?

chrisgray commented 7 years ago

Yes I have the dashboard and curl command running locally and I see the text/event-stream come through with no issues.

psundar-bill commented 7 years ago

It is the same for me too. Only over the network is there a problem, because of the delay in the CRLF. If you look at the capture in Wireshark and follow the TCP stream, you can see the chunks, but not when you follow the HTTP stream. Also in Wireshark, it can be seen that the CRLF for the earlier chunk comes in the next chunk. Please let me know if you don't see the same behavior.

chrisgray commented 7 years ago

When you mention delay, how long is this delay? Is this something that might be solved by explicitly turning on TcpNoDelay? For example, is it actually causing an issue or is it just making the dashboard appear jittery?

psundar-bill commented 7 years ago

Curl and the browsers (Chrome/Firefox/Safari) all keep waiting, with no response displayed. Maybe I shouldn't have called it a delay; rather, the CRLF of the earlier chunk comes with the next chunk, and the next chunk's CRLF is in turn in the following chunk. So the client (curl) assumes there is more of the response to come, as a chunk is always incomplete, and keeps waiting.
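To illustrate the framing described above: in HTTP/1.1 chunked encoding, each chunk is sent as its size in hex, a CRLF, the data, and a closing CRLF. The Java sketch below shows a strict decoder that only counts a chunk as complete once that closing CRLF has arrived. The class and method names are hypothetical, the input is assumed to be ASCII with no chunk extensions, and it is not a claim about how curl or any browser actually buffers data.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class ChunkedFramingSketch {

    // Frame one payload as a chunk: hex size, CRLF, data, CRLF.
    static String encodeChunk(String payload) {
        int size = payload.getBytes(StandardCharsets.US_ASCII).length;
        return Integer.toHexString(size) + "\r\n" + payload + "\r\n";
    }

    // Return the payloads of chunks whose closing CRLF has already been received.
    // Assumes ASCII input, so String indices match byte offsets.
    static List<String> completeChunks(String wire) {
        List<String> chunks = new ArrayList<>();
        int pos = 0;
        while (true) {
            int sizeEnd = wire.indexOf("\r\n", pos);
            if (sizeEnd < 0) break;                       // size line not complete yet
            int size = Integer.parseInt(wire.substring(pos, sizeEnd), 16);
            int dataStart = sizeEnd + 2;
            int dataEnd = dataStart + size;
            if (size == 0) break;                         // zero-size chunk ends the body
            if (wire.length() < dataEnd + 2) break;       // data or closing CRLF still missing
            if (!wire.startsWith("\r\n", dataEnd)) {
                throw new IllegalArgumentException("missing CRLF after chunk data");
            }
            chunks.add(wire.substring(dataStart, dataEnd));
            pos = dataEnd + 2;
        }
        return chunks;
    }

    public static void main(String[] args) {
        String chunk = encodeChunk("ping");
        String withoutCrlf = chunk.substring(0, chunk.length() - 2);
        // Data fully received, closing CRLF not yet: nothing is reported.
        System.out.println(completeChunks(withoutCrlf));
        // Closing CRLF received: the chunk is reported.
        System.out.println(completeChunks(chunk));
    }
}
```

Under this model, if each chunk's closing CRLF only travels with the next chunk, the most recent chunk is always one step behind, which matches the waiting behavior described above.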

chrisgray commented 7 years ago

Hmmm, interesting. At this point, though, I'm at a bit of a loss, since I am unable to reproduce the issue locally or remotely. Is there anything else you can think of that would be in the way between you and wherever you have breakerbox running on your network that would cause this delay?