stackvana / hook.io

Open-Source Microservice Hosting Platform
https://hook.io

Socket hangup on non-ASCII chars posted from IFTTT Maker #183

Open VerstandInvictus opened 8 years ago

VerstandInvictus commented 8 years ago

So, I suspect that this is not Hook's issue but Node's, or perhaps IFTTT's, but from the data I've been able to find, it seems that Hook may be able to work around it:

The specific issue is that when I POST a parameter string containing a non-ASCII character to Hook from IFTTT, I get either a socket hangup or a timeout in the IFTTT logs (unfortunately they don't give me tracebacks or any further data).

In testing manual entries (paste query URL into Firefox) I don't encounter socket hangups under the same circumstances as IFTTT (usually the squirrelly single and double quotes from Unicode - men’s) but I do encounter the same timeouts (ᕙ༼ຈل͜ຈ༽ᕗ or any subset thereof).

This seems relevant, but I wasn't sure I understood it, and in any event I can't control the actual headers on IFTTT:

http://stackoverflow.com/questions/18692580/node-js-post-causes-error-socket-hang-up-code-econnreset

What do you think? Really I just need to find a workaround for this, since the data flow causing problems is Pocket -> IFTTT -> Hook, and I don't control character makeup before the data gets into Hook. It seems like the data isn't getting into my Hook code at all - I've tried to log the raw request data via Hook code and nothing comes through, whereas it does when I do not use these characters.
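For reference, non-ASCII characters in a query string are normally percent-encoded as UTF-8 bytes before they go on the wire. The sketch below shows what that looks like for the curly-quote case mentioned above. The function name `build_hook_url` and the `example.invalid` URL are placeholders, not part of hook.io or IFTTT, and nothing here confirms what IFTTT actually sends.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_hook_url(base_url, params):
    """Return base_url with params appended as a percent-encoded query string.

    quote_via=quote encodes spaces as %20 (rather than +) and encodes every
    non-ASCII character as its UTF-8 bytes, so the result is pure ASCII.
    """
    return f"{base_url}?{urlencode(params, quote_via=quote)}"


# Hypothetical endpoint; U+2019 is the right single quotation mark in "men’s".
url = build_hook_url("https://example.invalid/hook", {"msg": "men\u2019s"})
print(url)

# Decoding the query string recovers the original text.
decoded = parse_qs(urlsplit(url).query)["msg"][0]
print(decoded)
```

If a request built this way succeeds while the raw, unencoded form times out, that would point toward the sender's encoding rather than the receiving service.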

Marak commented 8 years ago

Can you replicate this issue with a simple hook or curl request?

VerstandInvictus commented 8 years ago

I can replicate the timeout as above, but have not been able to replicate the socket hangup.

Marak commented 8 years ago

@VerstandInvictus -

Pushed a few updates. Can you try again?

If you can provide me an output of what the actual incoming HTTP is, it would make it possible for me to try and debug the issue. hook.io services should never return socket hangups. If you see a socket hangup, it means a worker or load-balancer crashed ( should never happen ).

I had seen a few random errors in the logs related to bad http headers, maybe that was it.

If you still can't get any information to log in the hook from IFTTT, try setting up a simple node.js server somewhere as the tie-breaker. Have that server dump the entire incoming request ( headers / body ), and post the data here.

I will look into it if you can provide more information.
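A request-dumping server of the kind suggested above can be quite small. The following is a minimal sketch using Python's standard library rather than node.js; the port number and the names `format_request`, `DumpHandler`, and `run` are arbitrary choices for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def format_request(method, path, headers, body):
    """Render a request as text: request line, headers, blank line, raw body."""
    lines = [f"{method} {path}"]
    lines += [f"{name}: {value}" for name, value in headers]
    lines.append("")
    lines.append(repr(body))  # repr keeps non-ASCII bytes visible as escapes
    return "\n".join(lines)


class DumpHandler(BaseHTTPRequestHandler):
    """Print every incoming GET/POST request, then answer 200."""

    def _dump(self):
        # Read exactly Content-Length bytes; an absent header means no body.
        length = int(self.headers.get("Content-Length") or 0)
        body = self.rfile.read(length) if length else b""
        print(format_request(self.command, self.path,
                             self.headers.items(), body), flush=True)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok\n")

    do_GET = _dump
    do_POST = _dump


def run(port=8080):
    """Serve until interrupted. Port 8080 is an assumption for this sketch."""
    HTTPServer(("", port), DumpHandler).serve_forever()
```

Calling `run()` on a publicly reachable host and pointing the sender at it prints each request's path, headers, and body to standard output.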

VerstandInvictus commented 8 years ago

So, I did recently move the problematic use case to a self-hosted Python/Flask application I wrote based on the hook code I was using, because you're right: it was impossible to get the outgoing HTTP request data from IFTTT any other way. I actually bypassed IFTTT entirely for that.

After doing that and debugging it, I ultimately believe that IFTTT is checking only one of the two possible properties that Pocket uses for a title, and that if it chokes on that it fills a blank property in my request (which is URL only), which is doing something bizarre somewhere. After I rewrote the self hosted app to use either title (or the URL if both are blank, which is possible for some reason), I haven't had any trouble.
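The fallback described above (use either title, or the URL if both are blank) could look roughly like this. The key names `primary_title`, `secondary_title`, and `url` are hypothetical stand-ins; the thread does not name the actual Pocket properties.

```python
def pick_title(item):
    """Return the first non-blank title in item, falling back to its URL.

    The key names are placeholders for illustration, not Pocket's real fields.
    """
    for key in ("primary_title", "secondary_title"):
        value = (item.get(key) or "").strip()
        if value:
            return value
    # Both titles are missing or blank, so use the URL instead.
    return item.get("url", "")
```

Treating `None`, an empty string, and whitespace-only strings the same way means a blank value from the source never reaches the outgoing request.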

I'm still running a couple other things on Hook and they're doing great, no problems. I'll let you know if I see anything else like this.


Marak commented 8 years ago

Okay.

Would it be possible to provide me with the problematic request? Perhaps a dump of the headers and body?

It's important that hook.io can receive data from any source, regardless of the format of the data being sent to us. I'd like to reproduce the issue and resolve it.

We may have already resolved the issue related to malformed headers.

Glad to hear you are liking the service. We just pushed a big release with role based access control and API keys. I'll be sending out a blast email later, but here is the blog post: https://hook.io/blog/role-based-access-control

Doing the final rounds of testing right now.

VerstandInvictus commented 8 years ago

I actually haven't been using IFTTT for this app, so I don't have them handy, but I can try to work something up to dump them this week - will let you know.


VerstandInvictus commented 8 years ago

So, I pointed the IFTTT action at my own nginx server and ran tcpdump; here are a few of its POSTs. No body was set in IFTTT, so the body should have been empty, and it was.

04:32:35.913345 IP ec2-54-91-112-58.compute-1.amazonaws.com.51678 > 10.0.0.7.http: Flags [P.], seq 1501847819:1501848767, ack 3030167473, win 58, options [nop,nop,TS val 116131387 ecr 565914220], length 948
E...g0@...ED6[p:
......PY.a........:9......
...;!.*lPOST /test.html?sec=foo&par=47580&st=null&msg=_B_R_K_%20Is%20there%20anyway%20to%20print%20RAW%20http%20requests%20in%20Flask/WSGI/any%20python%20web%20framework?%20_B_R_K_%20http://stackoverflow.com/questions/25466904/is-there-anyway-to-print-raw-http-requests-in-flask-wsgi-any-python-web-framewor%20_B_R_K_%20Excerpt_I%20know%20flask%20is%20based%20on%20WSGI.%20Is%20there%20anyway%20to%20get%20this%20to%20work%20with%20flask?%20%20%20This%20defines%20a%20piece%20of%20middleware%20to%20wrap%20your%20Flask%20application%20in.%20The%20advantage%20is%20that%20it%20operates%20entirely%20independent%20of%20Flask,%20giving%20you%20unfiltered%20insight%20into%20what%20goes%20in%20and%20what%20comes%20out. HTTP/1.1
Content-type: text/plain
host: stlvg.duckdns.org
content-length: 0
x-newrelic-id: XAMGV15QGwQJVllRDgQ=
x-newrelic-transaction: PxRVAwVUAQoFVwdUAQAEVkYdUFIOFQZOElMLAFsBA1NRUQFWBFVRVwcUG0MCVwtWAwJTBhVs
Connection: close

04:47:30.645987 IP ec2-54-227-129-156.compute-1.amazonaws.com.44421 > 10.0.0.7.http: Flags [P.], seq 862547589:862548541, ack 3194376463, win 58, options [nop,nop,TS val 25686192 ecr 566137905], length 952
E....!@....d6...
......P3in..fQ....:.6.....
....!..1POST /test.html?sec=foo&par=47580&st=null&msg=_B_R_K_%20Is%20there%20anyway%20to%20print%20RAW%20http%20requests%20in%20Flask/WSGI/any%20python%20web%20framework?%20_B_R_K_%20http://stackoverflow.com/questions/25466904/is-there-anyway-to-print-raw-http-requests-in-flask-wsgi-any-python-web-framewor%20_B_R_K_%20Excerpt_I%20know%20flask%20is%20based%20on%20WSGI.%20Is%20there%20anyway%20to%20get%20this%20to%20work%20with%20flask?%20%20%20This%20defines%20a%20piece%20of%20middleware%20to%20wrap%20your%20Flask%20application%20in.%20The%20advantage%20is%20that%20it%20operates%20entirely%20independent%20of%20Flask,%20giving%20you%20unfiltered%20insight%20into%20what%20goes%20in%20and%20what%20comes%20out. HTTP/1.1
Content-type: text/plain
host: stlvg.duckdns.org
content-length: 0
x-newrelic-id: XAMGV15QGwQJVllRDgQ=
x-newrelic-transaction: PxQGUFNSDQUHU1cAVFQAUVQTGlUDChAHHEAMAAFcVgVQUgADV1QGWgYBFU1EBwwAVVcHUFMTag==
Connection: close

05:32:45.487420 IP ec2-184-72-145-114.compute-1.amazonaws.com.40663 > 10.0.0.7.http: Flags [P.], seq 699864551:699865253, ack 2937712217, win 58, options [nop,nop,TS val 32334238 ecr 566816614], length 702
E...tc@......H.r
......P)......Y...:.......
..a.!..fPOST /test.html?sec=foo&par=47580&st=null&msg=_B_R_K_%20Use%20TCPDUMP%20to%20Monitor%20HTTP%20Traffic%20_B_R_K_%20https://sites.google.com/site/jimmyxu101/testing/use-tcpdump-to-monitor-http-traffic%20_B_R_K_%20Excerpt_1.%20To%20monitor%20HTTP%20traffic%20including%20request%20and%20response%20headers%20and%20message%20body:tcpdump%20-A%20-s%200%20%27tcp%20port%2080%20and%20(((ip[2:2]%20-%20((ip[0]&0xf)%3C%3C2))%20-%20((tcp[12]&0xf0)%3E%3E2))%20!=%200)%272. HTTP/1.1
Content-type: text/plain
host: stlvg.duckdns.org
content-length: 0
x-newrelic-id: XAMGV15QGwQJVllRDgQ=
x-newrelic-transaction: PxQCVV4BWgdVAQBbBQcPUkYdUFIOFQZOElcMWlpaAFdWBABTB1kHQEgUUQMDW1kEVQZDPw==
Connection: close

06:32:39.732438 IP ec2-184-72-145-114.compute-1.amazonaws.com.40515 > 10.0.0.7.http: Flags [P.], seq 2890173391:2890174110, ack 2521933848, win 58, options [nop,nop,TS val 33232799 ecr 567715179], length 719
E....D@...m..H.r
....C.P.D...Q.....:d......
....!..kPOST /test.html?sec=foo&par=47580&st=null&msg=_B_R_K_%20Installation%20_B_R_K_%20http://mitmproxy.org/doc/install.html%20_B_R_K_%20Excerpt_The%20preferred%20way%20to%20install%20mitmproxy%20is%20to%20use%20pip.%20A%20single%20command%20will%20install%20the%20latest%20release%20of%20mitmproxy,%20along%20with%20all%20its%20dependencies:%20%20This%20procedure%20may%20vary%20if,%20for%20instance,%20you%27ve%20installed%20Python%20from%20an%20external%20source%20like%20homebrew. HTTP/1.1
Content-type: text/plain
host: stlvg.duckdns.org
content-length: 0
x-newrelic-id: XAMGV15QGwQJVllRDgQ=
x-newrelic-transaction: PxRRBQdRXgcIBgBQAQgHVkYdUFIOFQZOElUIUQANBANSXF1RVVRVAVAUG0MCVwtWAwJTBhVs
Connection: close

VerstandInvictus commented 8 years ago

Command was tcpdump -A -s 0 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
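The filter expression in that command keeps only packets that carry TCP payload: it subtracts the IP header length and the TCP header length from the IP total length and requires a non-zero result. The same arithmetic, as a rough Python sketch over a raw IPv4 packet (the function name `tcp_payload_length` is made up for illustration, and IPv6 is not handled):

```python
def tcp_payload_length(ip_packet: bytes) -> int:
    """Return the TCP payload size of a raw IPv4 packet, in bytes."""
    # ip[2:2]: the 2-byte big-endian total length field at offset 2.
    total_length = int.from_bytes(ip_packet[2:4], "big")
    # (ip[0] & 0xf) << 2: low nibble is the header length in 32-bit words.
    ip_header_len = (ip_packet[0] & 0x0F) << 2
    tcp_segment = ip_packet[ip_header_len:]
    # (tcp[12] & 0xf0) >> 2: high nibble is the data offset in 32-bit words.
    tcp_header_len = (tcp_segment[12] & 0xF0) >> 2
    return total_length - ip_header_len - tcp_header_len


# Build a synthetic packet: 20-byte IP header, 20-byte TCP header, 5-byte payload.
ip_header = bytearray(20)
ip_header[0] = 0x45                      # version 4, header length 5 words
ip_header[2:4] = (45).to_bytes(2, "big") # total length = 20 + 20 + 5
tcp_header = bytearray(20)
tcp_header[12] = 0x50                    # data offset 5 words, no options
packet = bytes(ip_header) + bytes(tcp_header) + b"hello"

print(tcp_payload_length(packet))
```

A bare ACK with no data yields zero here, which is why the filter drops it and the capture shows only the packets containing HTTP text.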

VerstandInvictus commented 8 years ago

And here's a normal header from another webapp I'm running against the same server:

15:36:34.299410 IP c-73-25-89-49.hsd1.or.comcast.net.58765 > 10.0.0.7.http: Flags [P.], seq 257900422:257900890, ack 3822763501, win 16425, length 468
E...8.@.q.#.I.Y1
......P._?.....P.@)V...GET /scripts/leadvsgold.js HTTP/1.1
Host: stlvg.duckdns.org
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
Authorization: Basic c3VudGlnZXI6bWluZGtpbGxlcg==
Accept: */*
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36
Referer: http://stlvg.duckdns.org/
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8,es;q=0.6,af;q=0.4
Cookie: resolution=1920