mesosphere / marathon

Deploy and manage containers (including Docker) on top of Apache Mesos at scale.
https://mesosphere.github.io/marathon/
Apache License 2.0
4.07k stars 845 forks source link

Proxying with "Expect: 100-continue" does not work correctly #1763

Closed sttts closed 9 years ago

sttts commented 9 years ago

Workarounds

I deploy a redis cluster. srv005 is marathon leader. Then:

$ curl -v -f -H "Content-Type: application/json" -X POST http://srv004:5080/v2/groups -d@redis.json
* Hostname was NOT found in DNS cache
*   Trying 10.0.0.4...
* Connected to srv004 (10.0.0.4) port 5080 (#0)
> POST /v2/groups HTTP/1.1
> User-Agent: curl/7.37.1
> Host: srv004:5080
> Accept: */*
> Content-Type: application/json
> Content-Length: 2009
> Expect: 100-continue
> 
< HTTP/1.1 100 Continue
< HTTP/1.1 200 OK
< Content-Length: 0
* Server Jetty(8.y.z-SNAPSHOT) is not blacklisted
< Server: Jetty(8.y.z-SNAPSHOT)
< 
* Connection #0 to host srv004 left intact

But, talking to srv005 directly gives http 400:

$ curl -v -f -H "Content-Type: application/json" -X POST http://srv005:5080/v2/groups -d@redis.json
* Hostname was NOT found in DNS cache
*   Trying 10.0.0.5...
* Connected to srv005 (10.0.0.5) port 5080 (#0)
> POST /v2/groups HTTP/1.1
> User-Agent: curl/7.37.1
> Host: srv005:5080
> Accept: */*
> Content-Type: application/json
> Content-Length: 2009
> Expect: 100-continue
> 
< HTTP/1.1 100 Continue
* The requested URL returned error: 400 Bad Request
* Closing connection 0
curl: (22) The requested URL returned error: 400 Bad Request

In the logs of srv004:

Jul  8 09:52:00 srv004 marathon[6896]: [2015-07-08 09:52:00,372] INFO Proxying request to GET http://srv005:5080/v2/apps from srv004:5080 (mesosphere.marathon.api.JavaUrlConnectionRequestForwarder$:2
80)
Jul  8 09:52:14 srv004 marathon[6896]: [2015-07-08 09:52:14,708] WARN /v2/groups (org.eclipse.jetty.servlet.ServletHandler:563)
Jul  8 09:52:14 srv004 marathon[6896]: java.lang.RuntimeException: while proxying
Jul  8 09:52:14 srv004 marathon[6896]: #011at mesosphere.marathon.api.LeaderProxyFilter.doFilter(LeaderProxyFilter.scala:141)
Jul  8 09:52:14 srv004 marathon[6896]: #011at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
Jul  8 09:52:14 srv004 marathon[6896]: #011at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
Jul  8 09:52:14 srv004 marathon[6896]: #011at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
Jul  8 09:52:14 srv004 marathon[6896]: #011at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
Jul  8 09:52:14 srv004 marathon[6896]: #011at com.codahale.metrics.jetty8.InstrumentedHandler.handle(InstrumentedHandler.java:192)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.server.Server.handle(Server.java:370)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:651)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
Jul  8 09:52:14 srv004 marathon[6896]: #011at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
Jul  8 09:52:14 srv004 marathon[6896]: #011at java.lang.Thread.run(Thread.java:745)
Jul  8 09:52:14 srv004 marathon[6896]: Caused by: java.net.SocketTimeoutException: Read timed out
Jul  8 09:52:14 srv004 marathon[6896]: #011at java.net.SocketInputStream.socketRead0(Native Method)
Jul  8 09:52:14 srv004 marathon[6896]: #011at java.net.SocketInputStream.read(SocketInputStream.java:152)
Jul  8 09:52:14 srv004 marathon[6896]: #011at java.net.SocketInputStream.read(SocketInputStream.java:122)
Jul  8 09:52:14 srv004 marathon[6896]: #011at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
Jul  8 09:52:14 srv004 marathon[6896]: #011at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
Jul  8 09:52:14 srv004 marathon[6896]: #011at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
Jul  8 09:52:14 srv004 marathon[6896]: #011at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
Jul  8 09:52:14 srv004 marathon[6896]: #011at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
Jul  8 09:52:14 srv004 marathon[6896]: #011at sun.net.www.protocol.http.HttpURLConnection.expect100Continue(HttpURLConnection.java:1021)
Jul  8 09:52:14 srv004 marathon[6896]: #011at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1106)
Jul  8 09:52:14 srv004 marathon[6896]: #011at mesosphere.marathon.api.JavaUrlConnectionRequestForwarder$$anonfun$copyRequestBodyToConnection$1$1.apply(LeaderProxyFilter.scala:238)
Jul  8 09:52:14 srv004 marathon[6896]: #011at mesosphere.marathon.api.JavaUrlConnectionRequestForwarder$$anonfun$copyRequestBodyToConnection$1$1.apply(LeaderProxyFilter.scala:237)
Jul  8 09:52:14 srv004 marathon[6896]: #011at mesosphere.marathon.io.IO$.using(IO.scala:87)
Jul  8 09:52:14 srv004 marathon[6896]: #011at mesosphere.marathon.api.JavaUrlConnectionRequestForwarder.copyRequestBodyToConnection$1(LeaderProxyFilter.scala:237)
Jul  8 09:52:14 srv004 marathon[6896]: #011at mesosphere.marathon.api.JavaUrlConnectionRequestForwarder.copyRequestToConnection$1(LeaderProxyFilter.scala:248)
Jul  8 09:52:14 srv004 marathon[6896]: #011at mesosphere.marathon.api.JavaUrlConnectionRequestForwarder.forward(LeaderProxyFilter.scala:290)
Jul  8 09:52:14 srv004 marathon[6896]: #011at mesosphere.marathon.api.LeaderProxyFilter.doFilter(LeaderProxyFilter.scala:137)
Jul  8 09:52:14 srv004 marathon[6896]: #011... 26 more
Jul  8 09:52:14 srv004 marathon[6896]: [2015-07-08 09:52:14,714] INFO 10.0.0.254 -  -  [08/Jul/2015:09:52:04 +0000] "POST /v2/groups HTTP/1.1" 200 0 "-" "curl/7.37.1" (mesosphere.chaos.http.ChaosRequ
estLog:15)
sttts commented 9 years ago

Marathon 0.9.0 rc3

sttts commented 9 years ago

/cc @kolloch, @aquamatthias – feels like a blocker for the release

aquamatthias commented 9 years ago

Hey @sttts thanks for reporting this - we will investigate.

kolloch commented 9 years ago

Couldn't reproduce this with a local environment yet. I started a leader instance on port 8080 and a non-leader instance on port 8081. I used "--hostname localhost" to advertise the correct host name.

Then I performed a request to the leader:

http -v --json POST localhost:8080/v2/groups quatsch=1
POST /v2/groups HTTP/1.1
Accept: application/json
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 16
Content-Type: application/json
Host: localhost:8080
User-Agent: HTTPie/0.9.2

{
    "quatsch": "1"
}

HTTP/1.1 422
Cache-Control: no-cache, no-store, must-revalidate
Content-Type: application/json
Expires: 0
Pragma: no-cache
Server: Jetty(8.1.15.v20140411)
Transfer-Encoding: chunked
X-Marathon-Leader: http://localhost:8080

{
    "errors": [
        {
            "attribute": "id",
            "error": "Property missing which is mandatory"
        }
    ]
}

And the request to the non-leader:

http -v --json POST localhost:8081/v2/groups quatsch=1
POST /v2/groups HTTP/1.1
Accept: application/json
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 16
Content-Type: application/json
Host: localhost:8081
User-Agent: HTTPie/0.9.2

{
    "quatsch": "1"
}

HTTP/1.1 422 Unprocessable Entity
Cache-Control: no-cache, no-store, must-revalidate
Content-Length: 77
Content-Type: application/json
Expires: 0
Pragma: no-cache
Server: Jetty(8.1.15.v20140411)
X-Marathon-Leader: http://localhost:8080
X-Marathon-Via: 1.1 localhost:8081

{
    "errors": [
        {
            "attribute": "id",
            "error": "Property missing which is mandatory"
        }
    ]
}
kolloch commented 9 years ago

I can also not reproduce it in our own production environment which runs 0.9.0-RC3

kolloch commented 9 years ago

Just for reference, I also tried with illegal json ("asd") via httpie and curl. I also tried the redis.json above which was provided to me out-of-band.

kolloch commented 9 years ago

Seems to work with httpie but not with curl according to @sttts

sttts commented 9 years ago
$ curl -i -H "Content-Type: application/json" -X POST --trace-ascii /dev/stdout --data-binary @redis-kaputt.json http://srv004:5080/v2/groups                          
== Info: Hostname was NOT found in DNS cache
== Info:   Trying 10.0.0.4...
== Info: Connected to srv004 (10.0.0.4) port 5080 (#0)
=> Send header, 161 bytes (0xa1)
0000: POST /v2/groups HTTP/1.1
001a: User-Agent: curl/7.37.1
0033: Host: srv004:5080
0046: Accept: */*
0053: Content-Type: application/json
0073: Content-Length: 2333
0089: Expect: 100-continue
009f: 
<= Recv header, 23 bytes (0x17)
0000: HTTP/1.1 100 Continue
HTTP/1.1 100 Continue

=> Send data, 2333 bytes (0x91d)
0000: {.."id": "/redis-kaputt",.."groups": [{..."id": "databases",..."
0040: apps": [{...."id": "redis",...."container": {....."type": "DOCKE
0080: R",....."docker": {......"image": "redis:2.8",......"network": "
00c0: HOST".....},....."volumes": [{......"containerPath": "/data",...
0100: ..."hostPath": "/local/data/redis-dev",......"mode": "RW".....},
0140:  {}]....},...."cmd": "touch /data/redis.conf; redis-server /data
0180: /redis.conf --dir /data --loglevel verbose --requirepass geheim 
01c0: --port 6379 --masterauth geheim --min-slaves-to-write 1",...."cp
0200: us": 0.2,...."mem": 256.0,...."instances": 3,...."maxLaunchDelay
0240: Seconds": 5,...."healthChecks": [{....."protocol": "COMMAND",...
0280: .."command": { "value": "env; redis-cli -h $HOST -p 6379 -a gehe
02c0: im -n 1 info | egrep 'role:master|master_link_status:up'" },....
0300: ."gracePeriodSeconds": 20,....."intervalSeconds": 20,....."portI
0340: ndex": 0,....."timeoutSeconds": 10,....."maxConsecutiveFailures"
0380: : 3....}],...."upgradeStrategy": {....."minimumHealthCapacity": 
03c0: 0.51,....."maximumOverCapacity": 0....},...."constraints": [. ..
0400: ..["hostname", "UNIQUE"],.....["hostname", "LIKE", "srv004|srv00
0440: 5|srv006"]....]...}, {...."id": "sentinel",...."container": {...
0480: .."type": "DOCKER",....."docker": {......"image": "redis:2.8",..
04c0: ...."network": "HOST".....},....."volumes": [{......"containerPa
0500: th": "/data",......"hostPath": "/local/data/redis-sentinel-dev",
0540: ......"mode": "RW".....}]....},...."cmd": "touch /data/redis.con
0580: f; eval \"redis-sentinel /data/redis.conf --dir /data --loglevel
05c0:  verbose --port 26379 --masterauth geheim $(grep -q 'sentinel mo
0600: nitor' /data/redis.conf || echo '--sentinel monitor mymaster $(r
0640: edis-cli -a geheim -p 6379 info | grep master_host: | cut -f2 -d
0680: :) 6379 2 --sentinel auth-pass mymaster geheim') --sentinel down
06c0: -after-milliseconds 5000\"",...."cpus": 0.1,...."mem": 64.0,....
0700: "instances": 3,...."maxLaunchDelaySeconds": 5,...."healthChecks"
0740: : [{....."protocol": "COMMAND",....."command": { "value": "redis
0780: -cli -h $HOST -p 26379 -n 1 info | grep sentinel_masters:1" },..
07c0: ..."gracePeriodSeconds": 30,....."intervalSeconds": 30,....."por
0800: tIndex": 0,....."timeoutSeconds": 10,....."maxConsecutiveFailure
0840: s": 2....}],...."upgradeStrategy": {....."minimumHealthCapacity"
0880: : 0.51,....."maximumOverCapacity": 0....},...."constraints": [. 
08c0: ....["hostname", "UNIQUE"],.....["hostname", "LIKE", "srv004|srv
0900: 005|srv006"]....]...}]..}].}.

Gives HTTP 200

kolloch commented 9 years ago

Might be related to http://bugs.java.com/view_bug.do?bug_id=8012625

kolloch commented 9 years ago

I can reproduce the bug with https://gist.github.com/kolloch/b0ff39e96c3bbd083b8a, I will try with Java 8 now.

kolloch commented 9 years ago

0.8.2 already contains the issue. So this is not a regression blocking the release of 0.9.0 in my opinion.

Upgrading to Java 8 fixes the problem. Maybe we should do this soon?

https://github.com/mesosphere/marathon/issues/1544

kolloch commented 9 years ago

Further note: curl only sets the Expect-Header if the payload reaches a certain size.

air commented 9 years ago

Nice catch @kolloch!

kolloch commented 9 years ago

We will recommend Java 8 for Marathon 0.11. This should be solved by this.