istio / old_issues_repo

Deprecated issue-tracking repo, please post new issues or feature requests to istio/istio instead.

Got 503 when accessing bookinfo productpage on IBM Bluemix Container Service #39

Closed dilingchen closed 6 years ago

dilingchen commented 7 years ago

I got a 503 when I ran curl -o /dev/null -s -w "%{http_code}\n" http://${GATEWAY_URL}/productpage

[root@c582f1-n28-vm1 ~]# curl -v http://169.47.115.162/productpage
* About to connect() to 169.47.115.162 port 80 (#0)
*   Trying 169.47.115.162...
* Connected to 169.47.115.162 (169.47.115.162) port 80 (#0)
> GET /productpage HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 169.47.115.162
> Accept: */*
> 
< HTTP/1.1 503 Service Unavailable
< content-length: 57
< content-type: text/plain
< date: Thu, 22 Jun 2017 10:49:09 GMT
< server: envoy
< Connection: Keep-Alive
< 
* Connection #0 to host 169.47.115.162 left intact
upstream connect error or disconnect/reset before headers

I can see the ingress address, and it matches the external IP of the istio-ingress service:

# kubectl get ingress -o wide
NAME      HOSTS     ADDRESS          PORTS     AGE
gateway   *         169.xx.xx.xxx   80        2d
# kubectl get svc istio-ingress
NAME            CLUSTER-IP   EXTERNAL-IP      PORT(S)                      AGE
istio-ingress   10.10.10.6   169.xx.xx.xxx    80:30835/TCP,443:30012/TCP   2d
# export GATEWAY_URL=169.xx.xx.xxx:80

There is nothing in the log of the productpage container (kubectl logs productpage-v1-3210513262-nr4gm productpage).

Log of the proxy container (kubectl logs productpage-v1-3210513262-nr4gm proxy):

I0622 14:26:47.020826       1 controller.go:152] Event update: key "default/istio-ingress-controller-leader-istio"

The full proxy log is attached: proxy.txt

I verified that the product page is shown in the browser when I use port-forward and access it at http://localhost:9080/productpage

What might be the issue? Thanks.

GregHanson commented 7 years ago

Can you try the curl a few more times and then get the logs for the istio-ingress pod?

dilingchen commented 7 years ago

@GregHanson here are the logs for the istio-ingress pod. Thanks. istio-ingress-3255123743-hlh71.txt

frankbu commented 7 years ago

Looks like the ingress can't talk to productpage. Can you try to exec into the ingress pod and see what RDS, CDS, and SDS are returning? You should see something like this:

$ kubectl exec -it <ingress-pod> bash
# curl istio-pilot:8080/v1/routes/80/istio-proxy/ingress
{
  "virtual_hosts": [
   {
    "name": "*",
    "domains": [
     "*"
    ],
    "routes": [
     {
      "path": "/login",
      "cluster": "out.4bdc5a0e59af7107a7189467360a720381024b5c"
     },
     {
      "path": "/logout",
      "cluster": "out.4bdc5a0e59af7107a7189467360a720381024b5c"
     },
     {
      "path": "/productpage",
      "cluster": "out.4bdc5a0e59af7107a7189467360a720381024b5c"
     }
    ]
   }
  ]
}
# curl istio-pilot:8080/v1/clusters/istio-proxy/ingress
{
  "clusters": [
   {
    "name": "out.4bdc5a0e59af7107a7189467360a720381024b5c",
    "service_name": "productpage.default.svc.cluster.local|http|version=v1",
    "connect_timeout_ms": 1000,
    "type": "sds",
    "lb_type": "round_robin"
   }
  ]
}
# curl "istio-pilot:8080/v1/registration/productpage.default.svc.cluster.local|http|version=v1"
{
  "hosts": [
   {
    "ip_address": "172.17.0.12",
    "port": 9080
   }
  ]
}
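
The three lookups above can be chained so that SDS is always queried with exactly the service key that CDS returns. This is a minimal sketch, meant to be run inside the ingress pod, and it assumes curl, grep, sed and sort are available there. The helper names json_field and check_sds are hypothetical; only the istio-pilot:8080 endpoints come from the commands above.

```shell
#!/usr/bin/env bash
# Sketch: follow the Pilot v1 discovery chain CDS -> SDS from the ingress pod.
# PILOT matches the address used in the commands above; helper names are made up.
PILOT="${PILOT:-istio-pilot:8080}"

# Print the unique values of one JSON string field read from stdin.
# Plain grep/sed is used because jq may not exist in the container.
json_field() {
  grep -o "\"$1\": *\"[^\"]*\"" | sed -e 's/^[^:]*: *"//' -e 's/"$//' | sort -u
}

# For every service_name reported by CDS, ask SDS for its registered hosts.
check_sds() {
  curl -s "$PILOT/v1/clusters/istio-proxy/ingress" | json_field service_name |
  while IFS= read -r svc; do
    echo "== $svc"
    curl -s "$PILOT/v1/registration/$svc"
    echo
  done
}
```

Calling check_sds prints one "== <service key>" header per cluster followed by the SDS reply for that key. An empty "hosts" list would mean no instance is registered under that key.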
dilingchen commented 7 years ago

@frankbu

Is anything wrong in the output below?

[root@c582f1-n28-vm1 grafana]# kubectl exec -it istio-ingress-3255123743-hlh71 bash
root@istio-ingress-3255123743-hlh71:/# curl istio-pilot:8080/v1/routes/80/istio-proxy/ingress
{
  "virtual_hosts": [
   {
    "name": "*",
    "domains": [
     "*"
    ],
    "routes": [
     {
      "path": "/login",
      "cluster": "out.304d4aa908c3ed9923066c887a9466a0755bd896"
     },
     {
      "path": "/logout",
      "cluster": "out.304d4aa908c3ed9923066c887a9466a0755bd896"
     },
     {
      "path": "/productpage",
      "cluster": "out.304d4aa908c3ed9923066c887a9466a0755bd896"
     }
    ]
   }
  ]
}
root@istio-ingress-3255123743-hlh71:/# curl istio-pilot:8080/v1/clusters/istio-proxy/ingress
{
  "clusters": [
   {
    "name": "out.304d4aa908c3ed9923066c887a9466a0755bd896",
    "service_name": "productpage.default.svc.cluster.local|http",
    "connect_timeout_ms": 1000,
    "type": "sds",
    "lb_type": "round_robin"
   }
  ]
}
root@istio-ingress-3255123743-hlh71:/# curl "istio-pilot:8080/v1/registration/productpage.default.svc.cluster.local|http|version=v1"
{
  "hosts": [
   {
    "ip_address": "172.30.217.141",
    "port": 9080
   }
  ]
}
root@istio-ingress-3255123743-hlh71:/#
rshriram commented 7 years ago

On Tue, Jun 27, 2017 at 3:17 AM Diling Chen wrote:

> curl "istio-pilot:8080/v1/registration/productpage.default.svc.cluster.local|http|version=v1"

Can you curl this API with "out.304d4aa908c3ed9923066c887a9466a0755bd896" instead of productpage.default...?

~shriram

frankbu commented 7 years ago

@Linda009 it looks right. Can you confirm that 172.30.217.141 is the IP of the productpage pod and check whether you can curl 172.30.217.141:9080/productpage? It seems like that is where it must be failing.

dilingchen commented 7 years ago

@frankbu Yes, 172.30.217.141 is the IP of the productpage pod, and curl 172.30.217.141:9080/productpage fails:

[root@c582f1-n28-vm1 ~]# kubectl get pods -o wide
NAME                              READY     STATUS    RESTARTS   AGE       IP               NODE
details-v1-3207759430-rsjfk       2/2       Running   0          9d        172.30.217.135   169.47.101.180
grafana-4175437752-4f205          1/1       Running   0          9d        172.30.217.134   169.47.101.180
istio-ca-4190117216-kt7hz         1/1       Running   0          5d        172.30.222.139   169.47.101.168
istio-egress-1880612815-ktbvt     1/1       Running   0          9d        172.30.217.133   169.47.101.180
istio-ingress-3255123743-hlh71    1/1       Running   0          5d        172.30.222.141   169.47.101.168
istio-mixer-2598054512-d4770      1/1       Running   0          9d        172.30.217.130   169.47.101.180
istio-pilot-2676867826-jglrs      2/2       Running   0          9d        172.30.222.130   169.47.101.168
productpage-v1-3210513262-85ll7   2/2       Running   0          5d        172.30.217.141   169.47.101.180
prometheus-3208567892-27qvm       1/1       Running   0          9d        172.30.222.132   169.47.101.168
ratings-v1-832276092-03jzw        2/2       Running   0          9d        172.30.12.200    169.47.101.177
reviews-v1-2925430435-1tzsq       2/2       Running   0          9d        172.30.217.136   169.47.101.180
reviews-v2-3541796517-msw21       2/2       Running   0          9d        172.30.222.136   169.47.101.168
reviews-v3-4158162599-rb5dv       2/2       Running   0          9d        172.30.12.201    169.47.101.177
servicegraph-3117540837-vqmhn     1/1       Running   0          9d        172.30.222.134   169.47.101.168
[root@c582f1-n28-vm1 ~]# kubectl exec -it istio-ingress-3255123743-hlh71 bash
root@istio-ingress-3255123743-hlh71:/# curl -v 172.30.217.141:9080/productpage
*   Trying 172.30.217.141...
* Connected to 172.30.217.141 (172.30.217.141) port 9080 (#0)
> GET /productpage HTTP/1.1
> Host: 172.30.217.141:9080
> User-Agent: curl/7.47.0
> Accept: */*
> 
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer
root@istio-ingress-3255123743-hlh71:/# 
frankbu commented 7 years ago

This is strange since you did say that you were able to access the productpage service using port forwarding, so that implies the service is up and running. We also just confirmed that the Istio routes have been generated correctly and it is trying to call the right thing.

Do you see any related messages in the error logs of the ingress pod or the proxy container of the productpage pod?

I would try to see if you can access the productpage from pods other than the ingress. For example, exec into the ratings pod and see if you can curl the productpage from there.

Another thing I would suggest is to start up the sleep and httpbin samples and see whether you can curl from and to those pods in your cluster.

dilingchen commented 7 years ago

@frankbu

I deleted the ingress and productpage pods to get clean logs. I do not see any errors in the logs.

istio-ingress-3255123743-17m7h.txt productpage-v1-3210513262-hxdfh-proxy.txt

curl from the ratings pod:

root@ratings-v1-832276092-03jzw:/usr/src/app# curl -v 172.30.222.144:9080/productpage
* Hostname was NOT found in DNS cache
*   Trying 172.30.222.144...
* Connected to 172.30.222.144 (172.30.222.144) port 9080 (#0)
> GET /productpage HTTP/1.1
> User-Agent: curl/7.38.0
> Host: 172.30.222.144:9080
> Accept: */*
> 
< HTTP/1.1 404 Not Found
< date: Fri, 30 Jun 2017 02:21:45 GMT
* Server envoy is not blacklisted
< server: envoy
< content-length: 0
< 
* Connection #0 to host 172.30.222.144 left intact
root@ratings-v1-832276092-03jzw:/usr/src/app# 

curl from inside the productpage pod:

root@productpage-v1-3210513262-hxdfh:/opt/microservices# curl -v 172.30.222.144:9080/productpage
* Hostname was NOT found in DNS cache
*   Trying 172.30.222.144...
* Connected to 172.30.222.144 (172.30.222.144) port 9080 (#0)
> GET /productpage HTTP/1.1
> User-Agent: curl/7.38.0
> Host: 172.30.222.144:9080
> Accept: */*
> 
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer
root@productpage-v1-3210513262-hxdfh:/opt/microservices# curl -v localhost:9080/productpage
* Hostname was NOT found in DNS cache
*   Trying ::1...
* connect to ::1 port 9080 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 9080 (#0)
> GET /productpage HTTP/1.1
> User-Agent: curl/7.38.0
> Host: localhost:9080
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: text/html; charset=utf-8
< Content-Length: 3674
< Server: Werkzeug/0.11.11 Python/2.7.12
< Date: Fri, 30 Jun 2017 02:37:32 GMT
< 
<!DOCTYPE html>
<html>
  <head>
    <title>Simple Bookstore App</title>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">

<!-- Latest compiled and minified CSS -->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css">

<!-- Optional theme -->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap-theme.min.css">

  </head>
  <body>

<nav class="navbar navbar-inverse navbar-static-top">
    <div class="container">
        <div class="navbar-header">
            <a class="navbar-brand" href="#">BookInfo Sample</a>
        </div>

        <button type="button" class="btn btn-default navbar-btn navbar-right" data-toggle="modal" href="#login-modal">Sign in</button>

    </div>
</nav>

<!---
<div class="navbar navbar-inverse navbar-fixed-top">
  <div class="container">
    <div class="navbar-header pull-left">
      <a class="navbar-brand" href="#">Microservices Fabric BookInfo Demo</a>
    </div>
    <div class="navbar-header pull-right">
      <button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-collapse">
        <span class="icon-bar"></span>
        <span class="icon-bar"></span>
        <span class="icon-bar"></span>
      </button>
    </div>
    <div class="navbar-collapse collapse">

      <button type="button" class="btn btn-default navbar-btn pull-right" data-toggle="modal" data-target="#login-modal">Sign in</button>

    </div>
  </div>
</div>
-->

<div id="login-modal" class="modal fade" role="dialog">
  <div class="modal-dialog">
    <div class="modal-content">
      <div class="modal-header">
        <button type="button" class="close" data-dismiss="modal">&times;</button>
        <h4 class="modal-title">Please sign in</h4>
      </div>
      <div class="modal-body">
        <form method="post" action='login' name="login_form">
          <p><input type="text" class="form-control" name="username" id="username" placeholder="User Name"></p>
          <p><input type="password" class="form-control" name="passwd" placeholder="Password"></p>
          <p>
             <button type="submit" class="btn btn-primary">Sign in</button>
             <button type="button" class="btn btn-default" data-dismiss="modal">Cancel</button>
          </p>
        </form>
      </div>
    </div>

  </div>
</div>

<div class="container-fluid">
<div class="row">
<div class="col-md-12">
    <h3 class="text-center text-primary">The Comedy of Errors</h3>
    <p> <a href="https://en.wikipedia.org/wiki/The_Comedy_of_Errors">Wikipedia
    Summary</a>: The Comedy of Errors is one of <b>William
    Shakespeare's</b> early plays. It is his shortest and one of his
    most farcical comedies, with a major part of the humour coming
    from slapstick and mistaken identity, in addition to puns and word
    play.</p>
</div>
</div>

<div class="row">
<div class="col-md-6">

<h3>Sorry, product details are currently unavailable for this book.</h3>

</div>
<div class="col-md-6">

<h3>Sorry, product reviews are currently unavailable for this book.</h3>

</div>
</div>
</div>

<!-- Latest compiled and minified JavaScript -->
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js"></script>

<!-- Latest compiled and minified JavaScript -->
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/js/bootstrap.min.js"></script>

<script type="text/javascript">
$('#login-modal').on('shown.bs.modal', function () {
     $('#username').focus();
});
</script>

  </body>
</html>
* Closing connection 0
root@productpage-v1-3210513262-hxdfh:/opt/microservices# exit
exit
[root@c582f1-n28-vm1 ~]# 

So it does not work when using the pod's cluster-internal IP, but it does work when using localhost.
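
The comparison above can be reduced to status codes only, which is easier to repeat after each change. This is a minimal sketch, assuming curl is available in the container. The helper names http_code and compare_paths are hypothetical, and the -w "%{http_code}" format is the same one used in the first post.

```shell
#!/usr/bin/env bash
# Sketch: compare the direct (localhost) path with the pod-IP path.
# Helper names are made up for illustration.

# Print only the HTTP status code for a URL. curl prints 000 when no HTTP
# response was received (for example on "Connection reset by peer").
http_code() {
  curl -o /dev/null -s -m 5 -w '%{http_code}' "$1" || true
}

# Usage: compare_paths <pod-ip> <port> <path>
compare_paths() {
  echo "localhost: $(http_code "localhost:$2$3")"
  echo "pod IP:    $(http_code "$1:$2$3")"
}
```

For example, compare_paths 172.30.222.144 9080 /productpage run inside the productpage container. In the transcript above the localhost reply carries the Werkzeug server header, so that request reached the application directly. A 200 on localhost together with 000 on the pod IP would therefore suggest that the application itself is healthy and the failure lies somewhere on the path to the pod IP.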

frankbu commented 7 years ago

Please try to run the sleep service (https://github.com/istio/istio/tree/master/samples/apps/sleep) and then from it (exec -it <sleep pod> -c sleep bash) try the following:

curl -v ratings:9080/ratings
curl -v productpage:9080/productpage

If neither of those works, please try one more thing. Run the httpbin service (https://github.com/istio/istio/tree/master/samples/apps/httpbin) and then try to curl it from the sleep service:

curl -v httpbin:8000/headers
dilingchen commented 7 years ago

@frankbu Unfortunately, none of them work.

[root@c582f1-n28-vm1 istio-0.1.6]# kubectl exec -it sleep-216554478-2c431 -c sleep bash
root@sleep-216554478-2c431:/# curl -v ratings:9080/ratings
* Hostname was NOT found in DNS cache
*   Trying 10.10.10.79...
* Connected to ratings (10.10.10.79) port 9080 (#0)
> GET /ratings HTTP/1.1
> User-Agent: curl/7.35.0
> Host: ratings:9080
> Accept: */*
> 
< HTTP/1.1 503 Service Unavailable
< content-length: 57
< content-type: text/plain
< date: Sat, 01 Jul 2017 13:17:37 GMT
* Server envoy is not blacklisted
< server: envoy
< 
* Connection #0 to host ratings left intact
upstream connect error or disconnect/reset before headers
root@sleep-216554478-2c431:/# curl -v productpage:9080/productpage
* Hostname was NOT found in DNS cache
*   Trying 10.10.10.121...
* Connected to productpage (10.10.10.121) port 9080 (#0)
> GET /productpage HTTP/1.1
> User-Agent: curl/7.35.0
> Host: productpage:9080
> Accept: */*
> 
< HTTP/1.1 503 Service Unavailable
< content-length: 57
< content-type: text/plain
< date: Sat, 01 Jul 2017 13:18:01 GMT
* Server envoy is not blacklisted
< server: envoy
< 
* Connection #0 to host productpage left intact
upstream connect error or disconnect/reset before headers
[root@c582f1-n28-vm1 istio-0.1.6]# kubectl exec -it sleep-216554478-2c431 -c sleep bash
root@sleep-216554478-2c431:/# curl -v httpbin:8000/headers
* Hostname was NOT found in DNS cache
*   Trying 10.10.10.22...
* Connected to httpbin (10.10.10.22) port 8000 (#0)
> GET /headers HTTP/1.1
> User-Agent: curl/7.35.0
> Host: httpbin:8000
> Accept: */*
> 
< HTTP/1.1 503 Service Unavailable
< content-length: 57
< content-type: text/plain
< date: Sat, 01 Jul 2017 13:26:00 GMT
* Server envoy is not blacklisted
< server: envoy
< 
* Connection #0 to host httpbin left intact
upstream connect error or disconnect/reset before headers
root@sleep-216554478-2c431:/#
frankbu commented 7 years ago

Sounds like maybe networking in your cluster isn't working at all, not just Istio. Have you tried to curl between plain k8s (no kube-inject) services? Have you tried to kill everything and restart the whole thing?

tpiecora commented 7 years ago

@frankbu I have the exact same scenario. Have tried all that was said above with the same results. Running Tectonic/k8s 1.6.7 with istio 0.1.6 on AWS.

I can set up a simple echo service/deployment using plain k8s and can curl that from the ingress pod.

Any further insights on what the issue might be? @Linda009 did you ever come to a solution?

yiakwy commented 7 years ago

Hi @frankbu, the output is:

root@istio-ingress-1054723629-g47vg:/# curl istio-pilot:8080/v1/clusters/istio-proxy/ingress
{
  "clusters": [
   {
    "name": "out.304d4aa908c3ed9923066c887a9466a0755bd896",
    "service_name": "productpage.default.svc.cluster.local|http",
    "connect_timeout_ms": 1000,
    "type": "sds",
    "lb_type": "round_robin"
   }
  ]
 }

then:

root@istio-ingress-1054723629-g47vg:/# curl "istio-pilot:8080/v1/registration/productpage.default.svc.cluster.local|http"
{
  "hosts": []
 }

So what the hell is it?

frankbu commented 7 years ago

@yiakwy your problem is different from the one we're discussing here. It looks like the productpage pod, for whatever reason, failed to start up properly (there's no registered instance).

@tpiecora Interesting. By exact same problem, do you mean all the xDS curls generated the correct expected output and you also tried all the sleep service and httpbin service tests above? This is a very strange situation that so far seems to have only been happening to @Linda009. I had assumed it has been resolved, given no more info. If you want to try to debug from basics, I would suggest shutting down Istio first, and then 1) run the sleep and httpbin services using plain k8s (no kube-inject) and confirm you can curl httpbin from the sleep pod. 2) stop the two services, and then install istio. 3) run sleep and httpbin again (with kube-inject), and see if you can still curl or not. This will give you a minimal test case, where hopefully we can get to the bottom of this.
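
The three steps can be sketched as a handful of shell helpers. This is only a sketch under stated assumptions: the manifest paths samples/apps/sleep/sleep.yaml and samples/apps/httpbin/httpbin.yaml and the app=sleep pod label are guesses based on the sample directories linked earlier, the function names are hypothetical, and it assumes istioctl kube-inject writes the injected manifest to stdout. Installing Istio between steps 2 and 3 is left to the install guide.

```shell
#!/usr/bin/env bash
# Sketch of the minimal test case. The manifest paths and the app=sleep label
# are assumptions; adjust them to match your Istio release.
SLEEP_YAML="${SLEEP_YAML:-samples/apps/sleep/sleep.yaml}"
HTTPBIN_YAML="${HTTPBIN_YAML:-samples/apps/httpbin/httpbin.yaml}"

# Name of the first pod carrying the (assumed) app=sleep label.
sleep_pod() {
  kubectl get pods -l app=sleep -o jsonpath='{.items[0].metadata.name}'
}

# Curl httpbin from the sleep container and print only the HTTP status code.
curl_httpbin() {
  kubectl exec "$(sleep_pod)" -c sleep -- \
    curl -s -o /dev/null -w '%{http_code}\n' httpbin:8000/headers
}

# Step 1: plain Kubernetes, no sidecar.
deploy_plain() {
  kubectl apply -f "$SLEEP_YAML" -f "$HTTPBIN_YAML"
}

# Step 2: remove both services (install Istio afterwards, per its install guide).
remove_samples() {
  kubectl delete -f "$SLEEP_YAML" -f "$HTTPBIN_YAML"
}

# Step 3: redeploy with the sidecar injected.
deploy_injected() {
  istioctl kube-inject -f "$SLEEP_YAML" | kubectl apply -f -
  istioctl kube-inject -f "$HTTPBIN_YAML" | kubectl apply -f -
}
```

A run would be deploy_plain, wait for the pods to be Running, curl_httpbin, remove_samples, install Istio, deploy_injected, wait again, curl_httpbin. If the first curl_httpbin prints 200 and the second prints 503, the difference is isolated to the injected sidecars.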

yiakwy commented 7 years ago

Hi @frankbu. Thanks to @mwieczorek, by using the command

kubectl describe pod  -l app=productpage

I quickly got a report of "insufficient cpu". So I expanded my container nodes to 8 cores (4 nodes did not work) in GCE. Then I killed all the relevant services and restarted them. Thank God, everything now works as expected.

I suggest adding this command to the documentation so that beginners can check what is going on. Thank you again!
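
The same check can be narrowed to pods that are not Running and to the scheduling messages in their events. This is a minimal sketch: the helper names pending_pods and why_pending are hypothetical, and the grep pattern assumes the scheduler reports the problem with the words FailedScheduling or Insufficient, as in the "insufficient cpu" message quoted above.

```shell
#!/usr/bin/env bash
# Sketch: find pods of one app that are not Running and show why.
# Helper names are made up; the app=<name> label follows the command above.

# Print the names of pods with the given app label whose STATUS is not Running.
# With --no-headers the columns are NAME READY STATUS RESTARTS AGE.
pending_pods() {
  kubectl get pods -l "app=$1" --no-headers | awk '$3 != "Running" {print $1}'
}

# For each such pod, print the scheduling-related lines from its description.
why_pending() {
  pending_pods "$1" | while IFS= read -r pod; do
    echo "== $pod"
    kubectl describe pod "$pod" | grep -iE 'FailedScheduling|Insufficient'
  done
}
```

For example, why_pending productpage. No output means every matching pod is Running, in which case the cause is more likely elsewhere.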

errordeveloper commented 7 years ago

I am seeing the same issue on GKE, using Istio 0.1.6, and Kubernetes 1.6.7.

errordeveloper commented 7 years ago

I've re-configured everything from scratch, and I'm not seeing the issue any more. I believe I must have glanced over what's said in step 5 and applied istio.yaml as well as istio-auth.yaml.

kyessenov commented 7 years ago

Glad to hear that.

arycloud commented 7 years ago

I resolved this issue by removing the grafana, Prometheus and servicegraph pods and creating a k8s cluster on GKE with 8 nodes.

hzxuzhonghu commented 7 years ago

I got a 429 when I curl the istio-ingress service.

GregHanson commented 6 years ago

@Linda009 @hzxuzhonghu are you still having the same problems in Istio version 0.5.1 or later?

hzxuzhonghu commented 6 years ago

@GregHanson Sorry, I have not used it recently.

GregHanson commented 6 years ago

closing issue until more info is provided

iftachsc commented 5 years ago

Hitting the same with Istio 1.0.5.