ContentSquare / chproxy

Open-Source ClickHouse http proxy and load balancer
https://www.chproxy.org/
MIT License

[BUG] host header on requests to ClickHouse #415

Open adiletkabylbekov opened 6 months ago

adiletkabylbekov commented 6 months ago

We have an environment where connections to the ClickHouse node go through an NGINX reverse proxy that has a custom server_name (virtual host) with an ACL on it, so CHProxy sends user requests through that NGINX reverse proxy. CHProxy config:

server:
  http:
    listen_addr: "0.0.0.0:8080"
users:
- name: "user"
  password: "pass"
  to_user: "cl_user"
  to_cluster: "example_com"
clusters:
- name: "example_com"
  nodes: [ "example.com:80" ]
  heartbeat:
    request: "/?query=SELECT%201"
    response: "1\n"
  users:
  - name: "cl_user"
    password: "passs"

We found that CHProxy sets the wrong Host header when it forwards an HTTP query to the CH node: it just resends the Host header from the user's request.

Example:

curl -v -d 'select now()' -X POST -H "X-ClickHouse-User: user" -H "X-ClickHouse-Key: pass" "http://chproxy.com:8080"
* About to connect() to chproxy.com port 8080 (#0)
*   Trying 10.10.10.4...
* Connected to chproxy.com (10.10.10.4) port 8080 (#0)
> POST / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: chproxy.com:8080
> Accept: */*
> X-ClickHouse-User: user
> X-ClickHouse-Key: pass
> Content-Length: 12
> Content-Type: application/x-www-form-urlencoded
> 
* upload completely sent off: 12 out of 12 bytes
< HTTP/1.1 403 Forbidden
< Content-Length: 169
< Content-Type: text/html
< Date: Thu, 28 Mar 2024 08:09:10 GMT
< Server: nginx/1.12.1
< X-Cache: MISS
< 
<html>
<head><title>403 Forbidden</title></head>
<body bgcolor="white">
<center><h1>403 Forbidden</h1></center>
<hr><center>nginx/1.12.1</center>
</body>
</html>
* Connection #0 to host chproxy.com left intact

This request is rejected by the NGINX ACL because the Host header did not match the configured server_name, so the request went to the default virtual host, where a different ACL is configured.

If we add the required Host header to the request sent to CHProxy, the query is served successfully by the CH node, because it passes all the configuration and ACL checks on the NGINX reverse proxy:

curl -v -H "Host: example.com" -d 'select timezone()' -X POST -H "X-ClickHouse-User: user" -H "X-ClickHouse-Key: pass" "http://chproxy.com:8080"
* About to connect() to chproxy.com port 8080 (#0)
*   Trying 10.10.10.4...
* Connected to chproxy.com (10.10.10.4) port 8080 (#0)
> POST / HTTP/1.1
> User-Agent: curl/7.29.0
> Accept: */*
> Host: example.com
> X-ClickHouse-User: user
> X-ClickHouse-Key: pass
> Content-Length: 17
> Content-Type: application/x-www-form-urlencoded
> 
* upload completely sent off: 17 out of 17 bytes
< HTTP/1.1 200 OK
< Cache-Control: max-age=7200
< Content-Length: 14
< Content-Type: text/tab-separated-values; charset=UTF-8
< Date: Thu, 28 Mar 2024 07:06:43 GMT
< Server: nginx/1.12.1
< X-Cache: MISS
< X-Clickhouse-Format: TabSeparated
< X-Clickhouse-Query-Id: 17C09936B0D3ACBF
< X-Clickhouse-Server-Display-Name: ch3.i
< X-Clickhouse-Summary: {"read_rows":"1","read_bytes":"1","written_rows":"0","written_bytes":"0","total_rows_to_read":"0","result_rows":"0","result_bytes":"0"}
< X-Clickhouse-Timezone: Asia/Bishkek
< 
Asia/Bishkek
* Connection #0 to host 10.10.10.4 left intact

I think CHProxy needs to rewrite the Host header to the proper value on requests it sends to ClickHouse.
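To illustrate the requested behaviour, here is a minimal Go sketch built on the standard library's httputil.ReverseProxy. It is not chproxy's actual code: the helper names newNodeProxy and hostSeenUpstream and the rewriteHost switch are hypothetical, and a local test server stands in for the NGINX/ClickHouse node. It only shows that a Go reverse proxy passes the client's Host header upstream unless the outgoing request's Host field is set explicitly.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newNodeProxy builds a reverse proxy to a single upstream node.
// When rewriteHost is true, the outgoing Host header is replaced with
// the node's own host:port instead of the value the client sent.
// (Hypothetical helper, not part of chproxy.)
func newNodeProxy(node *url.URL, rewriteHost bool) *httputil.ReverseProxy {
	proxy := httputil.NewSingleHostReverseProxy(node)
	defaultDirector := proxy.Director
	proxy.Director = func(req *http.Request) {
		// The default director rewrites req.URL but leaves req.Host alone.
		defaultDirector(req)
		if rewriteHost {
			req.Host = node.Host
		}
	}
	return proxy
}

// hostSeenUpstream sends one request carrying clientHost through the proxy
// and returns the Host header the upstream received, plus the upstream's
// own host:port for comparison.
func hostSeenUpstream(clientHost string, rewriteHost bool) (seen, nodeHost string) {
	// Stand-in for the node behind NGINX: it only records the Host header.
	upstream := httptest.NewServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			seen = r.Host
		}))
	defer upstream.Close()

	node, err := url.Parse(upstream.URL)
	if err != nil {
		panic(err)
	}

	req := httptest.NewRequest(http.MethodPost, "http://"+clientHost+"/", nil)
	rec := httptest.NewRecorder()
	newNodeProxy(node, rewriteHost).ServeHTTP(rec, req)
	return seen, node.Host
}

func main() {
	seen, node := hostSeenUpstream("chproxy.com:8080", false)
	fmt.Printf("no rewrite: upstream saw Host=%q (node is %q)\n", seen, node)

	seen, node = hostSeenUpstream("chproxy.com:8080", true)
	fmt.Printf("rewrite:    upstream saw Host=%q (node is %q)\n", seen, node)
}
```

Without the rewrite, the upstream sees chproxy.com:8080, which matches the 403 case above. With the rewrite, it sees the node's own address, which is what a server_name match on the NGINX side would need. Whether this should be the default or an opt-in setting is a design question for the PR.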

mga-chka commented 6 months ago

Hi, did you look at these reverse proxy settings?

adiletkabylbekov commented 6 months ago

Hi, did you look at these reverse proxy settings?

That setting is for environments where chproxy itself sits behind a proxy. In our case the requests from chproxy to ClickHouse go through a proxy, which is not the same thing.

mga-chka commented 6 months ago

My bad, I read the issue too fast. In this case, the issue is quite specific and the core maintainers don't have a lot of time to work on chproxy, so feel free to add the feature in the code and we'll review the PR.