ory / keto


Multiplexing HTTP and gRPC with cmux does not work with Istio #854

Open nickjn92 opened 2 years ago

nickjn92 commented 2 years ago

Describe the bug

All gRPC requests sent to the keto pod on Kubernetes (using the keto-write service) seem to be handled by the HTTP handler instead of the gRPC handler.

After numerous attempts to get it working on Kubernetes together with Istio, I still can't get it to work, and I think the culprit is https://github.com/ory/keto/blob/ef103eb231c6ca466d7c928ca73b29bf3d4c23d1/internal/driver/daemon.go#L189

It seems that Istio can multiplex both plain HTTP/2 and gRPC requests over the same connection, while cmux matches on the connection rather than on each request. This might be the reason for the observed results: for example, a plain HTTP/2 request is sent first and cmux matches the connection to HTTP1, then the same connection is reused for gRPC requests, which also get routed to the HTTP1 handler.

Possible workarounds:

Reproducing the bug

  1. Create a Kubernetes cluster (using kind, minikube, or even GKE)
  2. Install Istio 1.12
  3. Deploy keto to the ory namespace
  4. Deploy a custom application to any other namespace
  5. Make sure both namespaces have istio-injection: enabled so that the sidecar proxies are created
  6. Send a gRPC request from the custom application to keto-write.ory:80
  7. Observe that the HTTP handler handles the request instead of the gRPC handler

Relevant log output

time=2022-03-14T18:44:53Z level=info msg=started handling request http_request=map[headers:map[content-type:application/grpc grpc-accept-encoding:gzip te:trailers user-agent:grpc-java-netty/1.44.0 x-b3-parentspanid:52145d659ec72f10 x-b3-sampled:0 x-b3-spanid:3f331637fd61f3ce x-b3-traceid:fc7685fc00116c1c52145d659ec72f10 x-envoy-attempt-count:1 x-forwarded-client-cert:By=spiffe://cluster.local/ns/ory/sa/ory-ksa;Hash=a814987f150b3d35acccfdee792e9b30c2ad4055e450b358402ed5d98f6a1714;Subject="";URI=spiffe://cluster.local/ns/iam/sa/default x-forwarded-proto:http x-request-id:c07a215e-7c89-4755-9b17-3ba7ab95deb7] host:keto-write.ory:80 method:POST path:/ory.keto.acl.v1alpha1.WriteService/TransactRelationTuples query:<nil> remote:127.0.0.6:51519 scheme:http]
time=2022-03-14T18:44:53Z level=info msg=completed handling request http_request=map[headers:map[content-type:application/grpc grpc-accept-encoding:gzip te:trailers user-agent:grpc-java-netty/1.44.0 x-b3-parentspanid:52145d659ec72f10 x-b3-sampled:0 x-b3-spanid:3f331637fd61f3ce x-b3-traceid:fc7685fc00116c1c52145d659ec72f10 x-envoy-attempt-count:1 x-forwarded-client-cert:By=spiffe://cluster.local/ns/ory/sa/ory-ksa;Hash=a814987f150b3d35acccfdee792e9b30c2ad4055e450b358402ed5d98f6a1714;Subject="";URI=spiffe://cluster.local/ns/iam/sa/default x-forwarded-proto:http x-request-id:c07a215e-7c89-4755-9b17-3ba7ab95deb7] host:keto-write.ory:80 method:POST path:/ory.keto.acl.v1alpha1.WriteService/TransactRelationTuples query:<nil> remote:127.0.0.6:51519 scheme:http] http_response=map[headers:map[content-type:text/plain; charset=utf-8 x-content-type-options:nosniff] size:19 status:404 text_status:Not Found took:205.347µs]

// Debugging proxy to verify that headers received are HTTP2
2022-03-14T21:12:30.340519Z debug   envoy http  [C4577][S10006260048823567630] request headers complete (end_stream=false):
':authority', 'keto-write-grpc.ory.svc.cluster.local:4467'
':path', '/ory.keto.acl.v1alpha1.WriteService/TransactRelationTuples'
':method', 'POST'
'content-type', 'application/grpc'
'te', 'trailers'
'user-agent', 'grpc-java-netty/1.44.0'
'grpc-accept-encoding', 'gzip'
'x-request-id', 'ef779e17-846f-4c8e-9d1c-90b8b6a68061'
'x-envoy-decorator-operation', 'keto-write.ory.svc.cluster.local:4467/*'
'x-envoy-peer-metadata', '<redacted>=='
'x-envoy-peer-metadata-id', 'sidecar~10.60.0.40~auth-provisioner-6c6bf845cd-rhv68.iam~iam.svc.cluster.local'
'x-envoy-attempt-count', '1'
'x-b3-traceid', '9901ce1f2b2bc3ada67eed3942baf2a4'
'x-b3-spanid', 'a67eed3942baf2a4'
'x-b3-sampled', '0'
'x-forwarded-proto', 'https'
'transfer-encoding', 'chunked'

Relevant configuration

serve:
  read:
    port: 4466
    host: 0.0.0.0
  write:
    port: 4467
    host: 0.0.0.0

Version

v0.8.0-alpha.0

On which operating system are you observing this issue?

Linux

In which environment are you deploying?

Kubernetes


noyoshi commented 2 years ago

+1 on this, experiencing the same issue right now. @nickjn92 did modifying the service names fix the issue?

zepatrik commented 2 years ago

Multiplexing will be dropped with #1091