skupperproject / skupper

Skupper is an implementation of a Virtual Application Network, enabling rich hybrid cloud communication.
http://skupper.io
Apache License 2.0

bridge server crashes with error #320

Closed kungfuchicken closed 3 years ago

kungfuchicken commented 3 years ago

Shortly after starting up, the bridge-server container crashes with this error:

2020-11-04T19:53:57.592Z bridge-server info sender eddae9cf-b472-af45-965a-463a2267499a closed for socket 10.196.5.97:60384@f1539762-ae1d-4e8d-8dd0-37eb78fab465
2020-11-04T19:53:57.602Z bridge-server info [skupper-router-68879cd7-cq44r_amqp_nats-cloud-gateway_to_tcp_10.196.5.97_1027] receiver attached
2020-11-04T19:53:57.605Z bridge-server info [skupper-router-68879cd7-cq44r_amqp_nats-cloud-gateway_to_tcp_10.196.5.97_1027] receiver attached
2020-11-04T19:53:57.605Z bridge-server info [skupper-router-68879cd7-cq44r_amqp_nats-cloud-gateway_to_tcp_10.196.5.97_1027] receiver attached
2020-11-04T19:53:57.606Z bridge-server info [skupper-router-68879cd7-cq44r_tcp_1027_to_amqp_nats-cloud-gateway] socket disconnected 10.196.5.97:60420
2020-11-04T19:53:57.607Z bridge-server info tcp ingress connection close {"address":"nats-cloud-gateway","protocol":"tcp","ingress":{"10.196.5.97:60364@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60364@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637211,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60366@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60366@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637213,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60368@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60368@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637216,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60388@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60388@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637303,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60390@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60390@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637307,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60392@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60392@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637308,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60412@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60412@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637407,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60416@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60416@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637417,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60418@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60418@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637419,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60442@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60442@f15
39762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637537,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60444@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60444@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637541,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60446@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60446@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637543,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60448@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60448@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637546,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60450@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60450@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637548,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60452@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60452@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637552,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"}},"egress":{}}
2020-11-04T19:53:57.607Z bridge-server info [skupper-router-68879cd7-cq44r_tcp_1027_to_amqp_nats-cloud-gateway] socket disconnected 10.196.5.97:60448
2020-11-04T19:53:57.607Z bridge-server info tcp ingress connection close {"address":"nats-cloud-gateway","protocol":"tcp","ingress":{"10.196.5.97:60364@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60364@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637211,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60366@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60366@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637213,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60368@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60368@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637216,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60388@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60388@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637303,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60390@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60390@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637307,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60392@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60392@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637308,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60412@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60412@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637407,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60416@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60416@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637417,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60418@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60418@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637419,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60442@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60442@f15
39762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637537,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60444@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60444@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637541,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60446@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60446@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637543,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60450@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60450@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637548,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60452@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60452@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637552,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"}},"egress":{}}
2020-11-04T19:53:57.609Z bridge-server info [skupper-router-68879cd7-cq44r_tcp_1027_to_amqp_nats-cloud-gateway] socket disconnected 10.196.5.97:60452
2020-11-04T19:53:57.609Z bridge-server info tcp ingress connection close {"address":"nats-cloud-gateway","protocol":"tcp","ingress":{"10.196.5.97:60364@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60364@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637211,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60366@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60366@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637213,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60368@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60368@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637216,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60388@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60388@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637303,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60390@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60390@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637307,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60392@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60392@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637308,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60412@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60412@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637407,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60416@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60416@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637417,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60418@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60418@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637419,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60442@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60442@f15
39762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637537,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60444@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60444@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637541,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60446@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60446@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637543,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"},"10.196.5.97:60450@f1539762-ae1d-4e8d-8dd0-37eb78fab465":{"id":"10.196.5.97:60450@f1539762-ae1d-4e8d-8dd0-37eb78fab465","start_time":1604519637548,"bytes_in":0,"bytes_out":0,"client":"10.196.5.97"}},"egress":{}}
2020-11-04T19:53:57.612Z bridge-server info [skupper-router-68879cd7-cq44r_amqp_nats-cloud-gateway_to_tcp_10.196.5.97_1027] receiver attached
2020-11-04T19:53:57.612Z bridge-server info [skupper-router-68879cd7-cq44r_amqp_nats-cloud-gateway_to_tcp_10.196.5.97_1027] receiver attached
2020-11-04T19:53:57.612Z bridge-server info receiver 10.196.5.97:60420@f1539762-ae1d-4e8d-8dd0-37eb78fab465@skupper-router-68879cd7-cq44r_tcp_1027_to_amqp_nats-cloud-gateway closed for socket 10.196.5.97:60420@f1539762-ae1d-4e8d-8dd0-37eb78fab465
events.js:174
      throw er; // Unhandled 'error' event
      ^
Error: Connectivity to the peer container was lost
    at Receiver.link.on_detach (/opt/app-root/node_modules/rhea/lib/link.js:165:86)
    at Session.on_detach (/opt/app-root/node_modules/rhea/lib/session.js:730:27)
    at Connection.(anonymous function) [as on_detach] (/opt/app-root/node_modules/rhea/lib/connection.js:809:30)
    at c.dispatch (/opt/app-root/node_modules/rhea/lib/types.js:910:33)
    at Transport.read (/opt/app-root/node_modules/rhea/lib/transport.js:109:36)
    at SaslClient.read (/opt/app-root/node_modules/rhea/lib/sasl.js:328:26)
    at Connection.input (/opt/app-root/node_modules/rhea/lib/connection.js:543:35)
    at Socket.emit (events.js:198:13)
    at addChunk (_stream_readable.js:288:12)
    at readableAddChunk (_stream_readable.js:269:11)
Emitted 'error' event at:
    at Container.dispatch (/opt/app-root/node_modules/rhea/lib/container.js:41:33)
    at Connection.dispatch (/opt/app-root/node_modules/rhea/lib/connection.js:261:40)
    at Connection.input (/opt/app-root/node_modules/rhea/lib/connection.js:561:18)
    at Socket.emit (events.js:198:13)
    at addChunk (_stream_readable.js:288:12)
    at readableAddChunk (_stream_readable.js:269:11)
    at Socket.Readable.push (_stream_readable.js:224:10)
    at TCP.onStreamRead [as onread] (internal/stream_base_commons.js:94:17)

I know bridge-server is set to be removed soon, but I'm not sure whether this will still be an issue with the new strategy.
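The `Unhandled 'error' event` line in the trace above is standard Node.js behavior: when an EventEmitter emits `error` and no listener is registered, the error is thrown, and if nothing catches it the process exits. The sketch below reproduces only that mechanism with Node's built-in `events` module. It does not use rhea or any bridge-server code; `fakeContainer` and `emitLinkError` are hypothetical names for illustration, and whether the actual fix belongs in a handler like this is an assumption.

```javascript
// Minimal sketch of Node's "unhandled 'error' event" behavior.
// Assumption: this only imitates the failure mode seen in the trace;
// it is not the bridge-server implementation and does not call rhea.
const { EventEmitter } = require('events');

// Emit an 'error' event and report whether it was handled or thrown.
// The try/catch stands in for the process boundary: in a real service
// an uncaught throw here would terminate the process.
function emitLinkError(emitter) {
  try {
    emitter.emit('error', new Error('Connectivity to the peer container was lost'));
    return 'handled';
  } catch (err) {
    return 'thrown: ' + err.message;
  }
}

// Case 1: no 'error' listener registered, so emit() throws the error.
const unhandled = emitLinkError(new EventEmitter());

// Case 2: a listener is registered, so the error is delivered to it instead.
const fakeContainer = new EventEmitter(); // hypothetical stand-in for an AMQP container
const seen = [];
fakeContainer.on('error', (err) => seen.push(err.message));
const handled = emitLinkError(fakeContainer);

console.log(unhandled);
console.log(handled, seen);
```

With a listener attached, the lost-connectivity error can be logged and the link re-established rather than taking the whole container down, which would avoid the restart loop described in this report.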

grs commented 3 years ago

Can you give a bit more detail on your setup?

kungfuchicken commented 3 years ago

This is coming from a cloud install in GKE connected with edge clusters running k3s. I've been able to continue sending messages back and forth even with the container down, but technically the deployment created by the skupper-site-controller (used in both clusters) does not have the minimum number of replicas in its ReplicaSet.

kungfuchicken commented 3 years ago

The pod goes into CrashLoopBackOff.

grs commented 3 years ago

Which version of skupper is this with (skupper version should tell you)?

kungfuchicken commented 3 years ago

The Skupper client reports 0.3.2. I double-checked the image used in the pod, too, since the pod was deployed via the controller and I figured it might be whatever version the controller was. The controller is at 0.3.2:

        - name: SKUPPER_SERVICE_CONTROLLER_IMAGE
          value: quay.io/skupper/service-controller:0.3.2
        image: quay.io/skupper/site-controller:0.3.2

The pod container for the bridge-server reports image: quay.io/skupper/bridge-server:0.3

kungfuchicken commented 3 years ago

It seems this issue becomes a non-issue once https://github.com/skupperproject/skupper/pull/284 is done, so please don't let this be a distraction :D I'm just trying to make sure to provide useful bug info.

Kampe commented 3 years ago

I'm seeing this issue as well with v0.3.2: the bridge server on the "hub" side is stuck in CrashLoopBackOff. The edge bridge servers appear fine and, as far as I can tell, are opening and closing TCP connections reliably. The router container is still in working order. However, HTTP connections are completely broken in this state.

For more information, we're creating the Skupper connection on both the "hub" and the "edge" sites using the Skupper Site Controller and this YAML:

hub

apiVersion: v1
kind: ConfigMap
metadata:
  name: skupper-site
data:
  cluster-local: "false"
  console: "true"
  console-authentication: internal
  console-password: "barney"
  console-user: "rubble"
  edge: "false"
  name: test-cloud
  router-console: "true"
  service-controller: "true"
  service-sync: "true"

edge

apiVersion: v1
kind: ConfigMap
metadata:
  name: skupper-site
data:
  cluster-local: "false"
  console: "true"
  console-authentication: internal
  console-password: "barney"
  console-user: "rubble"
  edge: "true"
  name: test-edge
  router-console: "true"
  service-controller: "true"
  service-sync: "true"

I believe this may warrant higher priority if removing the bridge server won't resolve the issue.

ajssmith commented 3 years ago

Closing as bridge server has been deprecated.