kumahq / kuma

🐻 The multi-zone service mesh for containers, Kubernetes and VMs. Built with Envoy. CNCF Sandbox Project.
https://kuma.io/install
Apache License 2.0

Panic when calling `/meshes/default/dataplanes/mycurlpod.kuma-demo/rules` endpoint #11194

Closed Automaat closed 1 month ago

Automaat commented 2 months ago

What happened?

This happened with MeshService enabled and a MeshTimeout policy applied that targets a real MeshService.

2024-08-23T09:07:42.673Z    INFO    api-server  http: panic serving 10.128.0.34:52839: missing kuma.io/service tag
goroutine 24100 [running]:
net/http.(*conn).serve.func1()
    net/http/server.go:1903 +0xbe
panic({0x34b1a20?, 0xc00438d420?})
    runtime/panic.go:770 +0x132
github.com/kumahq/kuma/pkg/core/xds/inspect.getOutboundRuleAttachments({0x7212d00, 0x0, 0x3c32b4b?}, 0xc002bcb3b0, {0x3c32b4b, 0xc}, 0xc000e46758)
    github.com/kumahq/kuma@v0.0.0-20240822133403-a37576517ec7/pkg/core/xds/inspect/rules.go:111 +0xaee
github.com/kumahq/kuma/pkg/core/xds/inspect.BuildRulesAttachments(0xc0039a6030, 0xc002bcb3b0, {0xc0024c2270, 0x5, 0xc004251e30?})
    github.com/kumahq/kuma@v0.0.0-20240822133403-a37576517ec7/pkg/core/xds/inspect/rules.go:48 +0x3e6
github.com/kumahq/kuma/pkg/api-server.addInspectEndpoints.inspectRulesAttachment.func2(0xc00365e000, 0xc000554000)
    github.com/kumahq/kuma@v0.0.0-20240822133403-a37576517ec7/pkg/api-server/inspect_endpoints.go:571 +0x248
github.com/emicklei/go-restful/v3.(*FilterChain).ProcessFilter(0x4ab7130?, 0xc003b18180?, 0x352e900?)
    github.com/emicklei/go-restful/v3@v3.12.1/filter.go:23 +0x53
github.com/kumahq/kuma/pkg/api-server.NewApiServer.func1(0xc00365e000, 0xc000554000, 0xc0039820f0)
    github.com/kumahq/kuma@v0.0.0-20240822133403-a37576517ec7/pkg/api-server/server.go:237 +0x2eb
github.com/emicklei/go-restful/v3.(*FilterChain).ProcessFilter(0xc003b18030?, 0x3c24373?, 0xc000d00610?)
    github.com/emicklei/go-restful/v3@v3.12.1/filter.go:21 +0x42
github.com/emicklei/go-restful/v3.CrossOriginResourceSharing.Filter({{0xc000b9bb50, 0x1, 0x1}, {0x0, 0x0, 0x0}, {0xc000c012c0, 0x1, 0x1}, 0x0, ...}, ...)
    github.com/emicklei/go-restful/v3@v3.12.1/cors_filter.go:53 +0xbf
github.com/emicklei/go-restful/v3.(*FilterChain).ProcessFilter(0x4ab7130?, 0xc003b18180?, 0x60?)
    github.com/emicklei/go-restful/v3@v3.12.1/filter.go:21 +0x42
github.com/kumahq/kuma/pkg/plugins/authn/api-server/tokens.(*plugin).NewAuthenticator.UserTokenAuthenticator.func1(0xc00365e000, 0xc000554000, 0xc0039820f0)
    github.com/kumahq/kuma@v0.0.0-20240822133403-a37576517ec7/pkg/plugins/authn/api-server/tokens/authenticator.go:38 +0x4c7
github.com/emicklei/go-restful/v3.(*FilterChain).ProcessFilter(0xc001604690?, 0x4ab7168?, 0xc003982000?)
    github.com/emicklei/go-restful/v3@v3.12.1/filter.go:21 +0x42
go.opentelemetry.io/contrib/instrumentation/github.com/emicklei/go-restful/otelrestful.OTelFilter.func1(0xc00365e000, 0xc000554000, 0xc0039820f0)
    go.opentelemetry.io/contrib/instrumentation/github.com/emicklei/go-restful/otelrestful@v0.53.0/restful.go:68 +0xa11
github.com/emicklei/go-restful/v3.(*FilterChain).ProcessFilter(0x0?, 0xc00358d6c8?, 0x49d68f?)
    github.com/emicklei/go-restful/v3@v3.12.1/filter.go:21 +0x42
github.com/kumahq/kuma/pkg/api-server.NewApiServer.MetricsHandler.func2.1()
    github.com/kumahq/kuma@v0.0.0-20240822133403-a37576517ec7/pkg/util/prometheus/gorestful_middleware.go:21 +0x1f
github.com/slok/go-http-metrics/middleware.Middleware.Measure({{{0x4aa8a70, 0xc00159b050}, {0x0, 0x0}, 0x0, 0x0, 0x0}}, {0x0, 0x0}, {0x4abc700, ...}, ...)
    github.com/slok/go-http-metrics@v0.12.0/middleware/middleware.go:117 +0x302
github.com/kumahq/kuma/pkg/api-server.NewApiServer.MetricsHandler.func2(0xc00365e000, 0xc000554000, 0xc0039820f0)
    github.com/kumahq/kuma@v0.0.0-20240822133403-a37576517ec7/pkg/util/prometheus/gorestful_middleware.go:20 +0x118
github.com/emicklei/go-restful/v3.(*FilterChain).ProcessFilter(0xc001a0a008?, 0x4aa54b0?, 0xc0007e28c0?)
    github.com/emicklei/go-restful/v3@v3.12.1/filter.go:21 +0x42
github.com/emicklei/go-restful/v3.(*Container).dispatch(0xc0004c61b0, {0x4aa54b0, 0xc0007e28c0}, 0xc0037c8120)
    github.com/emicklei/go-restful/v3@v3.12.1/container.go:296 +0x9e5
net/http.HandlerFunc.ServeHTTP(0xc000062820?, {0x4aa54b0?, 0xc0007e28c0?}, 0x81583a?)
    net/http/server.go:2171 +0x29
net/http.(*ServeMux).ServeHTTP(0x472b79?, {0x4aa54b0, 0xc0007e28c0}, 0xc0037c8120)
    net/http/server.go:2688 +0x1ad
net/http.serverHandler.ServeHTTP({0xc003a12480?}, {0x4aa54b0?, 0xc0007e28c0?}, 0x6?)
    net/http/server.go:3142 +0x8e
net/http.(*conn).serve(0xc0041d7560, {0x4ab7130, 0xc001b121b0})
    net/http/server.go:2044 +0x5e8
created by net/http.(*Server).Serve in goroutine 54
    net/http/server.go:3290 +0x4b4

jijiechen commented 1 month ago

I wasn't able to reproduce it at this time. I installed the counter demo app and created the following policies. When I tried the URL endpoint, it succeeded without any issues.

apiVersion: kuma.io/v1alpha1
kind: Mesh
metadata:
  labels:
    kuma.io/env: kubernetes
    kuma.io/origin: zone
    kuma.io/zone: default
  name: default
spec:
  meshServices:
    enabled: Everywhere
---
apiVersion: kuma.io/v1alpha1
kind: MeshTimeout
metadata:
  labels:
    k8s.kuma.io/namespace: kuma-demo
    kuma.io/env: kubernetes
    kuma.io/mesh: default
    kuma.io/origin: zone
    kuma.io/policy-role: consumer
    kuma.io/zone: default
  name: timeout-on-demo-app
  namespace: kuma-demo
spec:
  targetRef:
    kind: Mesh
  to:
  - default:
      connectionTimeout: 10s
      http:
        requestTimeout: 5s
        streamIdleTimeout: 1h0m0s
    targetRef:
      kind: MeshService
      name: demo-app
      namespace: kuma-demo

lobkovilya commented 1 month ago

Let's look at the code and see what the reason was.

Automaat commented 1 month ago

It was this: https://github.com/kumahq/kuma/blob/fc68afc736beb477c272ac13e86b46ea3e4a76b0/pkg/core/xds/inspect/rules.go#L109-L113 Is it still the case? Is it true for MeshService? I was unable to reproduce the panic locally.

lobkovilya commented 1 month ago

I managed to reproduce this on Universal when Dataplane.Networking.Outbound[].backendRef is set. In this case, the outbound doesn't have tags, and that's why we get the panic.
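
To illustrate the failure mode described above, here is a minimal sketch in Go. It is not the actual Kuma code: `Outbound`, `BackendRef`, and `legacyServiceName` are simplified, hypothetical stand-ins. The sketch only assumes what the thread states: an outbound defined through `backendRef` carries no tags, so a lookup of `kuma.io/service` has to report "not found" instead of panicking.

```go
package main

import "fmt"

// serviceTag is the tag named in the panic message from the report.
const serviceTag = "kuma.io/service"

// BackendRef is a hypothetical, simplified stand-in for the reference
// an outbound can carry instead of tags.
type BackendRef struct {
	Kind string
	Name string
}

// Outbound is a hypothetical, simplified stand-in for a dataplane outbound.
// Either Tags or BackendRef is expected to be set, not both.
type Outbound struct {
	Port       uint32
	Tags       map[string]string
	BackendRef *BackendRef
}

// legacyServiceName returns the kuma.io/service tag value for a tag-based
// outbound. It returns ok=false when the outbound uses backendRef or has
// no such tag, so the caller can skip it rather than panic.
func legacyServiceName(o Outbound) (name string, ok bool) {
	if o.BackendRef != nil {
		return "", false
	}
	// Indexing a nil map is safe in Go and yields ("", false).
	name, ok = o.Tags[serviceTag]
	return name, ok
}

func main() {
	outbounds := []Outbound{
		{Port: 10001, Tags: map[string]string{serviceTag: "backend"}},
		{Port: 10002, BackendRef: &BackendRef{Kind: "MeshService", Name: "demo-app"}},
	}
	for _, o := range outbounds {
		if name, ok := legacyServiceName(o); ok {
			fmt.Printf("port %d: tag-based outbound for service %q\n", o.Port, name)
			continue
		}
		fmt.Printf("port %d: no %s tag, skipping tag-based lookup\n", o.Port, serviceTag)
	}
}
```

Returning an `ok` flag lets the caller decide how to handle outbounds without tags (for example, by resolving the `backendRef` through a different path), which is one possible way to avoid the panic. The actual fix in Kuma may differ.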