cube-js / cube

📊 Cube — The Semantic Layer for Building Data Applications
https://cube.dev

HTTP/1.1 400 Bad Request when using envoy TLS proxy sidecar to encrypt traffic between cubestore-router and cubestore-worker #7367

Open murphymagic opened 10 months ago

murphymagic commented 10 months ago

I am deploying cubejs into a Kubernetes cluster, as described here.

To encrypt pod-to-pod traffic, I am deploying Envoy proxy sidecars in each pod to enable TLS, as described here.

This approach works fine for cube-api to cubestore-router pod traffic and cube-refresh-worker to cubestore-router traffic. Additionally, this approach works for traffic from cube pods to non-cube pods, such as keycloak and ksqldb.

However, this approach fails for traffic between cubestore-router and cubestore-worker. The cubestore-router logs are full of the following errors:

2023-11-01 14:25:54,012 ERROR [cubestore::http] <pid:1> Error processing HTTP command: Corrupted message received. Please check your worker and meta connection environment variables.

2023-11-01 14:25:56,550 ERROR [cubestore::util] <pid:1> Error during ChunkProcessing: CubeError { message: "Corrupted message received. Please check your worker and meta connection environment variables.", backtrace: "", cause: User }
2023-11-01 14:25:57,077 ERROR [cubestore::http] <pid:1> Error processing HTTP command: Corrupted message received. Please check your worker and meta connection environment variables.

2023-11-01 14:25:58,169 ERROR [cubestore::scheduler] <pid:1> Error processing event UpdateJob(IdRow { id: 32, row: Job { row_reference: Table(Tables, 17), job_type: TableImportCSV("stream://default/prod_pre_aggregations-k_asset_asset_details_rollup_nxbppwn1_gybfkt1_1ik4nth/0"), last_heart_beat: 2023-11-01T14:25:57.406644645Z, status: ProcessingBy("dbg-cubestore-worker-0.dbg-cubestore-worker-headless:9001") } }, IdRow { id: 32, row: Job { row_reference: Table(Tables, 17), job_type: TableImportCSV("stream://default/prod_pre_aggregations-k_asset_asset_details_rollup_nxbppwn1_gybfkt1_1ik4nth/0"), last_heart_beat: 2023-11-01T14:25:58.164603958Z, status: Error("Stale stream timeout: deadline has elapsed") } }): Corrupted message received. Please check your worker and meta connection environment variables.
2023-11-01 14:26:00,165 ERROR [cubestore::http] <pid:1> Error processing HTTP command: Corrupted message received. Please check your worker and meta connection environment variables.

The envoy logs show the following error: HTTP/1.1 400 DPE 0 11 http1.codec_error HPE_INVALID_METHOD (DPE = Downstream Protocol Error. Downstream = cubestore-router).
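For context, HPE_INVALID_METHOD is the kind of error an HTTP/1.1 parser reports when the first bytes on a connection do not form a recognizable request method. The sketch below is a simplified illustration of that check, not Envoy's actual parser; the function name `looks_like_http_request` and the binary sample bytes are made up for the example and do not represent the real cubestore wire format.

```python
# Simplified illustration only: this is NOT Envoy's HTTP codec, and the
# binary sample below is invented, not a real cubestore message.
HTTP_METHODS = {
    b"GET", b"POST", b"PUT", b"DELETE", b"HEAD",
    b"OPTIONS", b"PATCH", b"CONNECT", b"TRACE",
}


def looks_like_http_request(first_bytes: bytes) -> bool:
    """Return True if the data starts with a known HTTP method and a space."""
    method, sep, _ = first_bytes.partition(b" ")
    return bool(sep) and method in HTTP_METHODS


# A normal HTTP/1.1 request line passes the check.
print(looks_like_http_request(b"GET /health HTTP/1.1\r\n"))

# Arbitrary binary framing does not, so an HTTP proxy would reject it.
print(looks_like_http_request(b"\x00\x00\x00\x2a\x01\x02binary"))
```

If the router-to-worker connection carries something other than HTTP, a check of this kind would fail on the very first bytes, which would be consistent with the 400 response and the DPE flag in the Envoy log.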

Attached is a screenshot of a tcpdump of the traffic, in case it sheds any light on what is going on. [Screenshot 2023-10-24 at 12 37 46 PM]

paveltiunov commented 10 months ago

@murphymagic It seems you're using an HTTP proxy, but the transport between the workers and the router is a non-HTTP binary transport. If you're looking into this for compliance reasons, you might want to consider the BYOC Cube Cloud option, as it has a transport encryption option.
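To illustrate the distinction drawn above: a proxy that works at the TCP level forwards bytes without parsing them, so it has no notion of an "invalid method". The following is a minimal sketch of such an opaque relay, using in-process socket pairs in place of real network connections. The names `relay` and `demo` and the payload bytes are hypothetical, and the sketch leaves out TLS entirely. Whether a TCP-level Envoy listener actually works between cubestore-router and cubestore-worker is not confirmed anywhere in this thread.

```python
import socket
import threading


def relay(src: socket.socket, dst: socket.socket, bufsize: int = 4096) -> None:
    """Copy bytes from src to dst unchanged until src closes; no parsing."""
    while True:
        chunk = src.recv(bufsize)
        if not chunk:
            break
        dst.sendall(chunk)
    # Signal end-of-stream to the receiving side.
    dst.shutdown(socket.SHUT_WR)


def demo() -> bytes:
    """Send an invented binary payload through the relay and return what arrives."""
    client, proxy_in = socket.socketpair()
    proxy_out, server = socket.socketpair()

    worker = threading.Thread(target=relay, args=(proxy_in, proxy_out))
    worker.start()

    # Invented bytes; not a real cubestore message.
    client.sendall(b"\x00\x00\x00\x2a\x01\x02not-http")
    client.shutdown(socket.SHUT_WR)

    received = b""
    while True:
        chunk = server.recv(4096)
        if not chunk:
            break
        received += chunk

    worker.join()
    for sock in (client, proxy_in, proxy_out, server):
        sock.close()
    return received


if __name__ == "__main__":
    print(demo())
```

The payload reaches the far side byte for byte because the relay never inspects it, which is the property an HTTP-aware proxy lacks.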