mga-chka opened 1 year ago
So after looking into this, here are some of my notes and suggestions (documenting here what was discussed offline):
As discussed we will start utilising the low level ch-go library to help with the protocol implementation. This will speed up an initial implementation and we can always move away if we run into limitations (or upstream a change to ch-go).
A couple of points to note:
One alternative approach is to go the full proxy route (TCP -> TCP). That means we need very minimal protocol understanding: only the hello exchange for authentication and enough understanding of the query request, while for the query response we just need to figure out how many blocks we need to read (which requires decoding, but not reconstructing a full response). That allows us to focus first on how to properly handle the TCP connections before spending a lot of time on HTTP -> TCP translation.
On the TCP -> TCP protocol, that means we need to change the proposed configuration. We need to validate that only the TCP server connects with the TCP URL of ClickHouse.
One potential solution is to turn clusters from a list of addresses into structs with:
- Host
- HTTP port
- TCP port

That would allow each server to choose what is available (and we can validate that the TCP port is set when a TCP server is configured, or either throw an error or use the default port if it isn't set).
However, this would be a backwards-incompatible change. Given how big a change the TCP feature will introduce, I don't think it is bad to introduce a backwards-incompatible change alongside it.
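A minimal sketch of what the proposed cluster node struct could look like. The names (`NodeConfig`, `Validate`) and the fallback to ClickHouse's default native port 9000 are assumptions for illustration, not the final design:

```go
package main

import (
	"errors"
	"fmt"
)

// NodeConfig replaces a plain address string in the cluster list.
// Field names and yaml tags are illustrative, not final.
type NodeConfig struct {
	Host     string `yaml:"host"`
	HTTPPort int    `yaml:"http_port"`
	TCPPort  int    `yaml:"tcp_port"` // optional; needed only when a TCP server is configured
}

// defaultTCPPort is ClickHouse's default native-protocol port.
const defaultTCPPort = 9000

// Validate checks the node config against the servers that are enabled.
// If a TCP server is configured but no TCP port is set, we can either
// throw an error or fall back to the default port; this sketch falls back.
func (n *NodeConfig) Validate(tcpServerEnabled bool) error {
	if n.Host == "" {
		return errors.New("host must be set")
	}
	if tcpServerEnabled && n.TCPPort == 0 {
		n.TCPPort = defaultTCPPort
	}
	return nil
}

func main() {
	n := NodeConfig{Host: "ch1.example.com", HTTPPort: 8123}
	if err := n.Validate(true); err != nil {
		panic(err)
	}
	fmt.Println(n.TCPPort) // TCP port was unset, so it fell back to the default
}
```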
For the TCP -> TCP diagram we will have to handle the following flows, starting with authentication:
```mermaid
sequenceDiagram
    client->>chproxy: ClientHello
    chproxy->>chproxy: Determine User/Password from configuration
    chproxy->>clickhouse: ClientHello
    clickhouse->>chproxy: ServerHello
    chproxy->>chproxy: Determine protocol revision and other relevant metadata
    chproxy->>client: ServerHello, forward right protocol revision and metadata, with chproxy server name
```
Next we can work on the Query flow (starting with SELECTs). Note that for INSERTs we can have data blocks coming from the client.
```mermaid
sequenceDiagram
    client->>chproxy: Query
    client->>chproxy: Empty Data Block
    chproxy->>chproxy: Check Query user (based on TCP client session) and determine settings
    chproxy->>clickhouse: Query (kill based on timeout)
    clickhouse->>chproxy: Data Blocks (with Query response and Query Metadata)
    clickhouse->>chproxy: Data End of Stream
    chproxy->>chproxy: Cache Data Blocks
    chproxy->>client: Data Blocks
```
Note that these are both happy flow diagrams. I didn't include e.g. killed queries.
Additionally, there is a question about how to respond to clients. Do we stream the response back to the client right away and cache asynchronously (cleaning the cache entry if we encounter a ClickHouse exception)? Or do we wait for all the data to be available before we respond to the client?
IMO the first approach would be preferred. I don't think we should wait for the full response before starting to send data back; that will also make it easier to avoid overloading chproxy's memory.
I think we would also benefit from some good abstractions over the Data Stream in the TCP protocol. E.g. an iterator pattern to deal with the different blocks/types in the protocol.
Note that queries might not work with such a simple interface, as we could receive multiple metadata responses during a query (e.g. profile info, logs, query progress). See for example clickhouse-go's processing of data blocks.
```go
package protocol

// ProtocolIterator abstracts iteration over the packets of a TCP data stream.
type ProtocolIterator interface {
	HasNext() bool
	// GetNextTyped decodes the next packet, which is expected
	// to carry the given protocol code.
	GetNextTyped(expected ProtocolCode) interface{}
}
```
Maybe a state machine would be better suited for this type of problem?
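To make the state-machine idea concrete, here is a toy version that tolerates the interleaved Progress/Log/ProfileInfo packets a real query produces (the packet codes are the server-side values listed later in this issue; the state names and `step` function are hypothetical):

```go
package main

import "fmt"

// Server packet codes (subset), matching ClickHouse's Core/Protocol.h.
const (
	srvData        = 1
	srvException   = 2
	srvProgress    = 3
	srvEndOfStream = 5
	srvProfileInfo = 6
	srvLog         = 10
)

type queryState int

const (
	stateReading queryState = iota // still consuming the response stream
	stateDone                      // EndOfStream received
	stateFailed                    // Exception or unexpected packet
)

// step advances the state machine for one incoming packet. Unlike a strict
// iterator that expects exactly one packet type next, this accepts metadata
// packets in any order until the stream terminates.
func step(s queryState, packet int) queryState {
	if s != stateReading {
		return s // terminal states absorb further input
	}
	switch packet {
	case srvData, srvProgress, srvProfileInfo, srvLog:
		return stateReading
	case srvEndOfStream:
		return stateDone
	case srvException:
		return stateFailed
	default:
		return stateFailed // unknown packet: fail closed
	}
}

func main() {
	s := stateReading
	for _, p := range []int{srvData, srvProgress, srvLog, srvData, srvEndOfStream} {
		s = step(s, p)
	}
	fmt.Println(s == stateDone)
}
```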
Also, as this will be quite a lot of effort (especially if we make a backwards-incompatible change to the configuration), should we consider creating a separate branch so we can still make fixes/contribute to master while we work on TCP support?
> IMO first approach would be preferred. I don't think we should wait for the full response to start sending data back, that will make it easier to avoid overloading chproxy memory as well.
We should use the same logic as the one we're using for the HTTP protocol:
> Also as this will be quite a lot of effort (and especially if we make a backwards compatible change to the configuration), should we consider creating a seperate branch so we can still make fixes/contribute to master while we work on TCP support?
One tradeoff would be a small refactor of the current codebase so that we can add the TCP logic in specific files and iterate without the risk of breaking something. Conceptually, what we're doing with TCP is the same as what we're doing with HTTP. It's just that in the TCP case we need to handle the implementation of the protocol ourselves, whereas with HTTP it's hidden by the httputil.ReverseProxy interface. So if we can implement a tcputil.ReverseProxy interface, it might be enough (more or less; of course we will face some limitations regarding some features, but it can be a starting point).
FYI I did a simple prototype to help on the design, feel free to play with it and modify it: https://github.com/mga-chka/tcpproxy
The aim of this feature is to make chproxy work with both HTTP and TCP.
There are 2 aspects of this feature:
Both TCP connections should be turned on/off with new variables in the conf file:
At the end of step 1-e), we should be able to use the command-line tool clickhouse-client and connect to chproxy (in secure mode if we can) to do select queries:
```
clickhouse-client --host <CHPROXY_IP> --secure --port <CHPROXY_PORT> --user <USERNAME> --password <PASSWORD>
```
At the end of step 3), we should be able to do insert queries.

This is a big task that should be done in multiple steps:

0) understand how the clickhouse TCP protocol works

We can't just use an existing Go TCP client for clickhouse (like ch-go) because we need to be able to mimic the behaviour of a clickhouse server, so that the chproxy clients believe they're talking with clickhouse. Therefore, we need to understand how the protocol works and implement it.
This link gives the workflows when:
- establishing a connection,
- sending a read query,
- sending a write query

https://github.com/housepower/ClickHouse-Native-JDBC/blob/21cbb5ebab0a5cab54174e049c268ab8bc6da032/docs/deep-dive/native_protocol.md
In a few words, a client can send 6 types of messages:
- Hello = 0, used to establish a connection and check the protocol versions of the client and the server https://github.com/ClickHouse/ClickHouse/blob/master/src/Server/TCPHandler.cpp#L1123
- Query = 1, used to send a query (may contain a query id & query settings) https://github.com/ClickHouse/ClickHouse/blob/master/src/Server/TCPHandler.cpp#L1386
- Data = 2, used to send data to clickhouse (mainly for insert queries)
- Cancel = 3, used to cancel a query
- Ping = 4, used to check if the connection to the server is alive
- KeepAlive = 6, used to keep the connection alive (not sure if we need it in chproxy)
FYI, here are the other messages the client can send, but they're mainly used internally by clickhouse to send messages between shards (full list available here https://github.com/ClickHouse/ClickHouse/blob/master/src/Core/Protocol.h#L134):
- TablesStatusRequest = 5
- Scalar = 7
- IgnoredPartUUIDs = 8
- ReadTaskResponse = 9
- MergeTreeReadTaskResponse = 10
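The client packet codes above transcribe naturally into a Go enum; a sketch with a `String` method for readable logging (type and method names are illustrative):

```go
package main

import "fmt"

// ClientPacket enumerates the client-side packet codes from Core/Protocol.h.
type ClientPacket uint64

const (
	ClientHello     ClientPacket = 0
	ClientQuery     ClientPacket = 1
	ClientData      ClientPacket = 2
	ClientCancel    ClientPacket = 3
	ClientPing      ClientPacket = 4
	ClientKeepAlive ClientPacket = 6
)

// String makes packet codes readable in logs and error messages.
func (p ClientPacket) String() string {
	switch p {
	case ClientHello:
		return "Hello"
	case ClientQuery:
		return "Query"
	case ClientData:
		return "Data"
	case ClientCancel:
		return "Cancel"
	case ClientPing:
		return "Ping"
	case ClientKeepAlive:
		return "KeepAlive"
	default:
		return fmt.Sprintf("ClientPacket(%d)", uint64(p))
	}
}

func main() {
	fmt.Println(ClientQuery, ClientKeepAlive) // prints "Query KeepAlive"
}
```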
The server can send back 11 types of messages:
- Hello = 0, the response to the hello query
- Data = 1, used to send data (for example the result of a query)
- Exception = 2, sent if something happened during the request
- Progress = 3, query execution progress: rows read, bytes read [we will not implement it in this PR because it might require a huge refactoring]
- Pong = 4, Ping response
- EndOfStream = 5, sent at the end of the response
- ProfileInfo = 6, a packet with profiling info (nb: not sure if it's mandatory for the first version; if not, we will not implement it at first)
- Totals = 7, an option that can be asked by the client (with the SQL clause `WITH TOTALS`): nb we will not implement it in this PR
- Extremes = 8, an option that can be asked by the client (with the setting `extremes`): nb we will not implement it in this PR
- TablesStatusResponse = 9, an option that can be asked by the client: nb we will not implement it in this PR
- Log = 10, used to show the logs of the query execution: nb we will not implement it in this PR

FYI, here are the other messages the server can send, but they're mainly used internally by clickhouse to send messages between shards (full list available here https://github.com/ClickHouse/ClickHouse/blob/master/src/Core/Protocol.h#L67):
- TableColumns = 11
- PartUUIDs = 12
- ReadTaskRequest = 13
- ProfileEvents = 14
- MergeTreeReadTaskRequest = 15
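Likewise for the server side; a sketch of the enum plus a small helper capturing which packets terminate a response stream (the helper name is an assumption, the codes come from the list above):

```go
package main

import "fmt"

// ServerPacket enumerates the server-side packet codes from Core/Protocol.h.
type ServerPacket uint64

const (
	ServerHello                ServerPacket = 0
	ServerData                 ServerPacket = 1
	ServerException            ServerPacket = 2
	ServerProgress             ServerPacket = 3
	ServerPong                 ServerPacket = 4
	ServerEndOfStream          ServerPacket = 5
	ServerProfileInfo          ServerPacket = 6
	ServerTotals               ServerPacket = 7
	ServerExtremes             ServerPacket = 8
	ServerTablesStatusResponse ServerPacket = 9
	ServerLog                  ServerPacket = 10
)

// terminatesResponse reports whether a packet ends the query response stream:
// either the normal EndOfStream marker or an Exception.
func terminatesResponse(p ServerPacket) bool {
	return p == ServerEndOfStream || p == ServerException
}

func main() {
	fmt.Println(terminatesResponse(ServerData), terminatesResponse(ServerEndOfStream)) // prints "false true"
}
```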
The protocol changes every 2-6 months on average. Most changes are only for the inner logic between shards: cf https://github.com/ClickHouse/ClickHouse/blame/master/src/Core/ProtocolDefines.h
The protocol is backward compatible, so we should be fine if we mimic the latest protocol version. But, in order to add new clickhouse features that rely on the protocol, we might need to update chproxy.
Nb: we will need to set the client_tcp_protocol_version we will use for both the client and clickhouse; we will take the latest one from clickhouse when we start the development (defined in ProtocolDefines.h).
1) the first big milestone is to be able to handle read-only queries (i.e. SELECT). This task can be divided as follows:
1-a) maintain the HTTP interface for clients and communicate with clickhouse using TCP
Here is the workflow: if chproxy is configured with TCP, every time we get an HTTP query, we create a TCP connection to clickhouse, send the query in binary, get an answer, then send it back to the HTTP caller.
nb: no caching or connection pooling at this step
1-b) add caching abilities
1-c) add settings stored in the HTTP query params to the TCP connection
1-d) add a pool of TCP connections (for each clickhouse shard & for the clients) to avoid creating connections every time
Warning: we should be careful if a previous query modified a setting in a TCP connection to clickhouse, for example putting max_execution_time to 5 sec. In this case, we should either drop the connection or have a way to reset all the settings.
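The settings warning above can be expressed as a "taint" flag on pooled connections: if a query mutated a session-level setting, the connection is closed instead of being returned to the pool. A minimal sketch (all names hypothetical; real code would wrap net.Conn and actually dial/close):

```go
package main

import "fmt"

// pooledConn wraps a backend connection plus a flag set whenever a query
// changed a session-level setting (e.g. "SET max_execution_time = 5").
type pooledConn struct {
	id      int
	tainted bool
}

// pool is a trivial free-list pool; next stands in for dialing new connections.
type pool struct {
	free []*pooledConn
	next int
}

func (p *pool) get() *pooledConn {
	if n := len(p.free); n > 0 {
		c := p.free[n-1]
		p.free = p.free[:n-1]
		return c
	}
	p.next++
	return &pooledConn{id: p.next} // real code would dial clickhouse here
}

func (p *pool) put(c *pooledConn) {
	if c.tainted {
		return // drop it: its session settings could leak into other queries
	}
	p.free = append(p.free, c)
}

func main() {
	p := &pool{}
	a := p.get()
	a.tainted = true // a query mutated a session setting
	p.put(a)
	b := p.get()
	fmt.Println(b.id != a.id) // tainted connection was not reused
}
```

The alternative mentioned in the step, resetting all settings before reuse, would trade the reconnect cost for an extra round-trip per checkout.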
1-e) [read query only] provide a TCP interface for clients and communicate with clickhouse using TCP
1-f) ability to cancel a query when the client stops its HTTP/TCP connection or asks to cancel the query
1-g) handle the ping query: whether a client triggers a ping over TCP or HTTP, we need to be able to send a ping to clickhouse over TCP
1-h) [maybe optional] add a TLS layer:
1-i) [optional] implement some of the missing features of the TCP protocol like the Progress msgs, the totals msg, the extremes, ...
2) do benchmarks on TCP vs HTTP for select queries (and put the results in the doc)
3) make the TCP protocol work for write queries
4) do benchmarks on TCP vs HTTP for insert queries (and put the results in the doc)