ClickHouse / clickhouse-js

Official JS client for ClickHouse DB
https://clickhouse.com
Apache License 2.0
205 stars 25 forks source link

HTTP socket closes even when keep_alive is used #202

Closed varrocs closed 11 months ago

varrocs commented 11 months ago

Describe the bug

When using insert() with a keep_alive connection and the async_insert option the socket is closed after each insert request.

Expected behaviour

The socket is kept open and it is reused for the next insert.

Code example

const { createClient } = require('@clickhouse/client');

//create table test_table(`a` UInt16, `b` UInt16) ENGINE=MergeTree PRIMARY KEY a;

const TABLE = process.env.DOMAIN_REPORT_TABLE ?? 'test_table';

const client = createClient({
  host: process.env.CLICKHOUSE_HOST ?? 'http://localhost:18123',
  username: process.env.CLICKHOUSE_USER ?? 'default',
  password: process.env.CLICKHOUSE_PASSWORD,
  keep_alive: {
    enabled: true,
    socket_ttl: 600_000,
    retry_on_expired_socket: true,
  },
  async_insert: true, // I know this is redundant
  clickhouse_settings: {
    async_insert: true,
  },
  // To make sure inserts happen on the same connection
  max_open_connections: 1,
})

const insert = ( params ) => {
  void client.insert(
    {
      table: TABLE,
      format: 'JSONEachRow',
      values: params,
      async_insert: true, // I know, I know, redundancy
      clickhouse_settings: {
        async_insert: true,
      },
      keep_alive: {
        enabled: true,
        socket_ttl: 600_000,
        retry_on_expired_socket: true,
      },
    });
};

const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms) );

const main = async () => {
  await insert({a: 1, b: 1});
  await wait(1);
  await insert({a: 2, b: 2});
  await wait(1);
  await insert({a: 3, b: 3});
};

void main();

Error log

See the tcpdump outputs below

Configuration

Environment

ClickHouse server

The script that ran the clickhouse server:

docker run -it --rm -p 18123:8123 -p19000:9000 --name some-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server

Additional notes

e want to store a stream of events in ClickHouse. We don't want to lose data when one of our processes is killed or crashed so we want to push the data as soon as possible and use ClickHouse's async_insert feature to create batches for insert. The socket connection process takes significant time so we'd like to reuse an open connection.

I tried to insert a stream like in this example but when the client is killed the data already sent is lost. It's not committed to the DB.

I suspect that the problem is that destroying the stream eventually closes the socket: https://github.com/ClickHouse/clickhouse-js/blob/main/packages/client-node/src/connection/node_base_connection.ts#L378

I guess the socket shouldn't be closed but somehow returned to the HTTP agent's free socket pool When I commented out this line the socket doesn't get closed with an RST flag but it times out after 3 seconds and gets closed by the server with a FIN flag.

So that gave me an idea to drain the stream instead of closing it so I made this change:

        //stream.destroy();
        stream.pipe(fs.createWriteStream("/dev/null")); // Obviously it's not platform independent but works and the draining should be awaited

And this works! The socket is not closed, but reused. The data is written to the database even when I kill the client while streaming.

So I suggest to drain stream instead of closing it. Please note that I haven't tried it without keepalive or without async insert.

I write a proper patch if you don't have the resources.

tcpdump output - original

Note the

[R.]
[S]
[S.]
[.]

sequences

The socket close is always initiated by the client lib side I used wireshark but that's hard to include in a bug report.

14:48:22.574071 IP localhost.18123 > localhost.36238: Flags [S.], seq 1426716164, ack 1514971420, win 65483, options [mss 65495,sackOK,TS val 1268033896 ecr 1268033896,nop,wscale 7], length 0
14:48:22.574087 IP localhost.36238 > localhost.18123: Flags [.], ack 1, win 512, options [nop,nop,TS val 1268033896 ecr 1268033896], length 0
14:48:22.574394 IP localhost.36238 > localhost.18123: Flags [P.], seq 1:322, ack 1, win 512, options [nop,nop,TS val 1268033897 ecr 1268033896], length 321
14:48:22.574407 IP localhost.18123 > localhost.36238: Flags [.], ack 322, win 510, options [nop,nop,TS val 1268033897 ecr 1268033897], length 0
14:48:22.574611 IP localhost.36238 > localhost.18123: Flags [P.], seq 322:327, ack 1, win 512, options [nop,nop,TS val 1268033897 ecr 1268033897], length 5
14:48:22.574622 IP localhost.18123 > localhost.36238: Flags [.], ack 327, win 510, options [nop,nop,TS val 1268033897 ecr 1268033897], length 0
14:48:22.895644 IP localhost.18123 > localhost.36238: Flags [P.], seq 1:572, ack 327, win 512, options [nop,nop,TS val 1268034218 ecr 1268033897], length 571
14:48:22.895659 IP localhost.36238 > localhost.18123: Flags [.], ack 572, win 508, options [nop,nop,TS val 1268034218 ecr 1268034218], length 0
14:48:22.895675 IP localhost.18123 > localhost.36238: Flags [P.], seq 572:574, ack 327, win 512, options [nop,nop,TS val 1268034218 ecr 1268034218], length 2
14:48:22.895685 IP localhost.36238 > localhost.18123: Flags [.], ack 574, win 508, options [nop,nop,TS val 1268034218 ecr 1268034218], length 0
14:48:22.895763 IP localhost.18123 > localhost.36238: Flags [P.], seq 574:579, ack 327, win 512, options [nop,nop,TS val 1268034218 ecr 1268034218], length 5
14:48:22.895773 IP localhost.36238 > localhost.18123: Flags [.], ack 579, win 508, options [nop,nop,TS val 1268034218 ecr 1268034218], length 0
14:48:22.897521 IP localhost.36238 > localhost.18123: Flags [R.], seq 327, ack 579, win 512, options [nop,nop,TS val 1268034220 ecr 1268034218], length 0
14:48:22.898668 IP localhost.36250 > localhost.18123: Flags [S], seq 3533489188, win 65495, options [mss 65495,sackOK,TS val 1268034221 ecr 0,nop,wscale 7], length 0
14:48:22.898686 IP localhost.18123 > localhost.36250: Flags [S.], seq 1874316382, ack 3533489189, win 65483, options [mss 65495,sackOK,TS val 1268034221 ecr 1268034221,nop,wscale 7], length 0
14:48:22.898700 IP localhost.36250 > localhost.18123: Flags [.], ack 1, win 512, options [nop,nop,TS val 1268034221 ecr 1268034221], length 0
14:48:22.898773 IP localhost.36250 > localhost.18123: Flags [P.], seq 1:322, ack 1, win 512, options [nop,nop,TS val 1268034221 ecr 1268034221], length 321
14:48:22.898784 IP localhost.18123 > localhost.36250: Flags [.], ack 322, win 510, options [nop,nop,TS val 1268034221 ecr 1268034221], length 0
14:48:22.898806 IP localhost.36250 > localhost.18123: Flags [P.], seq 322:327, ack 1, win 512, options [nop,nop,TS val 1268034221 ecr 1268034221], length 5
14:48:22.898814 IP localhost.18123 > localhost.36250: Flags [.], ack 327, win 510, options [nop,nop,TS val 1268034221 ecr 1268034221], length 0
14:48:23.296913 IP localhost.18123 > localhost.36250: Flags [P.], seq 1:572, ack 327, win 512, options [nop,nop,TS val 1268034619 ecr 1268034221], length 571
14:48:23.296947 IP localhost.36250 > localhost.18123: Flags [.], ack 572, win 508, options [nop,nop,TS val 1268034619 ecr 1268034619], length 0
14:48:23.296982 IP localhost.18123 > localhost.36250: Flags [P.], seq 572:574, ack 327, win 512, options [nop,nop,TS val 1268034619 ecr 1268034619], length 2
14:48:23.296998 IP localhost.36250 > localhost.18123: Flags [.], ack 574, win 508, options [nop,nop,TS val 1268034619 ecr 1268034619], length 0
14:48:23.297173 IP localhost.18123 > localhost.36250: Flags [P.], seq 574:579, ack 327, win 512, options [nop,nop,TS val 1268034619 ecr 1268034619], length 5
14:48:23.297199 IP localhost.36250 > localhost.18123: Flags [.], ack 579, win 508, options [nop,nop,TS val 1268034619 ecr 1268034619], length 0
14:48:23.297505 IP localhost.36250 > localhost.18123: Flags [R.], seq 327, ack 579, win 512, options [nop,nop,TS val 1268034620 ecr 1268034619], length 0
14:48:23.298549 IP localhost.36256 > localhost.18123: Flags [S], seq 657571832, win 65495, options [mss 65495,sackOK,TS val 1268034621 ecr 0,nop,wscale 7], length 0
14:48:23.298570 IP localhost.18123 > localhost.36256: Flags [S.], seq 241782485, ack 657571833, win 65483, options [mss 65495,sackOK,TS val 1268034621 ecr 1268034621,nop,wscale 7], length 0
14:48:23.298591 IP localhost.36256 > localhost.18123: Flags [.], ack 1, win 512, options [nop,nop,TS val 1268034621 ecr 1268034621], length 0
14:48:23.298709 IP localhost.36256 > localhost.18123: Flags [P.], seq 1:322, ack 1, win 512, options [nop,nop,TS val 1268034621 ecr 1268034621], length 321
14:48:23.298723 IP localhost.18123 > localhost.36256: Flags [.], ack 322, win 510, options [nop,nop,TS val 1268034621 ecr 1268034621], length 0
14:48:23.298771 IP localhost.36256 > localhost.18123: Flags [P.], seq 322:327, ack 1, win 512, options [nop,nop,TS val 1268034621 ecr 1268034621], length 5
14:48:23.298782 IP localhost.18123 > localhost.36256: Flags [.], ack 327, win 510, options [nop,nop,TS val 1268034621 ecr 1268034621], length 0
14:48:23.697211 IP localhost.18123 > localhost.36256: Flags [P.], seq 1:574, ack 327, win 512, options [nop,nop,TS val 1268035019 ecr 1268034621], length 573
14:48:23.697241 IP localhost.36256 > localhost.18123: Flags [.], ack 574, win 508, options [nop,nop,TS val 1268035019 ecr 1268035019], length 0
14:48:23.697381 IP localhost.18123 > localhost.36256: Flags [P.], seq 574:579, ack 327, win 512, options [nop,nop,TS val 1268035020 ecr 1268035019], length 5
14:48:23.697395 IP localhost.36256 > localhost.18123: Flags [.], ack 579, win 508, options [nop,nop,TS val 1268035020 ecr 1268035020], length 0
14:48:23.697825 IP localhost.36256 > localhost.18123: Flags [R.], seq 327, ack 579, win 512, options [nop,nop,TS val 1268035020 ecr 1268035020], length 0

tcpdump output - fixed version

17:09:29.190648 IP localhost.34898 > localhost.18123: Flags [S], seq 270947609, win 65495, options [mss 65495,sackOK,TS val 1270301667 ecr 0,nop,wscale 7], length 0
17:09:29.190666 IP localhost.18123 > localhost.34898: Flags [S.], seq 1708423778, ack 270947610, win 65483, options [mss 65495,sackOK,TS val 1270301667 ecr 1270301667,nop,wscale 7], length 0
17:09:29.190679 IP localhost.34898 > localhost.18123: Flags [.], ack 1, win 512, options [nop,nop,TS val 1270301667 ecr 1270301667], length 0
17:09:29.190982 IP localhost.34898 > localhost.18123: Flags [P.], seq 1:322, ack 1, win 512, options [nop,nop,TS val 1270301667 ecr 1270301667], length 321
17:09:29.190991 IP localhost.18123 > localhost.34898: Flags [.], ack 322, win 510, options [nop,nop,TS val 1270301667 ecr 1270301667], length 0
17:09:29.191197 IP localhost.34898 > localhost.18123: Flags [P.], seq 322:327, ack 1, win 512, options [nop,nop,TS val 1270301667 ecr 1270301667], length 5
17:09:29.191205 IP localhost.18123 > localhost.34898: Flags [.], ack 327, win 510, options [nop,nop,TS val 1270301667 ecr 1270301667], length 0
17:09:29.429992 IP localhost.18123 > localhost.34898: Flags [P.], seq 1:572, ack 327, win 512, options [nop,nop,TS val 1270301906 ecr 1270301667], length 571
17:09:29.430004 IP localhost.34898 > localhost.18123: Flags [.], ack 572, win 508, options [nop,nop,TS val 1270301906 ecr 1270301906], length 0
17:09:29.430018 IP localhost.18123 > localhost.34898: Flags [P.], seq 572:574, ack 327, win 512, options [nop,nop,TS val 1270301906 ecr 1270301906], length 2
17:09:29.430024 IP localhost.34898 > localhost.18123: Flags [.], ack 574, win 508, options [nop,nop,TS val 1270301906 ecr 1270301906], length 0
17:09:29.430503 IP localhost.18123 > localhost.34898: Flags [P.], seq 574:579, ack 327, win 512, options [nop,nop,TS val 1270301907 ecr 1270301906], length 5
17:09:29.430514 IP localhost.34898 > localhost.18123: Flags [.], ack 579, win 508, options [nop,nop,TS val 1270301907 ecr 1270301907], length 0
17:09:29.434663 IP localhost.34898 > localhost.18123: Flags [P.], seq 327:648, ack 579, win 512, options [nop,nop,TS val 1270301911 ecr 1270301907], length 321
17:09:29.434691 IP localhost.18123 > localhost.34898: Flags [.], ack 648, win 510, options [nop,nop,TS val 1270301911 ecr 1270301911], length 0
17:09:29.434756 IP localhost.34898 > localhost.18123: Flags [P.], seq 648:653, ack 579, win 512, options [nop,nop,TS val 1270301911 ecr 1270301911], length 5
17:09:29.434763 IP localhost.18123 > localhost.34898: Flags [.], ack 653, win 510, options [nop,nop,TS val 1270301911 ecr 1270301911], length 0
17:09:29.822719 IP localhost.18123 > localhost.34898: Flags [P.], seq 579:1152, ack 653, win 512, options [nop,nop,TS val 1270302299 ecr 1270301911], length 573
17:09:29.822892 IP localhost.18123 > localhost.34898: Flags [P.], seq 1152:1157, ack 653, win 512, options [nop,nop,TS val 1270302299 ecr 1270301911], length 5
17:09:29.823451 IP localhost.34898 > localhost.18123: Flags [.], ack 1157, win 512, options [nop,nop,TS val 1270302300 ecr 1270302299], length 0
17:09:29.825994 IP localhost.34898 > localhost.18123: Flags [P.], seq 653:974, ack 1157, win 512, options [nop,nop,TS val 1270302302 ecr 1270302299], length 321
17:09:29.826013 IP localhost.18123 > localhost.34898: Flags [.], ack 974, win 510, options [nop,nop,TS val 1270302302 ecr 1270302302], length 0
17:09:29.826104 IP localhost.34898 > localhost.18123: Flags [P.], seq 974:979, ack 1157, win 512, options [nop,nop,TS val 1270302302 ecr 1270302302], length 5
17:09:29.826114 IP localhost.18123 > localhost.34898: Flags [.], ack 979, win 510, options [nop,nop,TS val 1270302302 ecr 1270302302], length 0
17:09:30.222700 IP localhost.18123 > localhost.34898: Flags [P.], seq 1157:1730, ack 979, win 512, options [nop,nop,TS val 1270302699 ecr 1270302302], length 573
17:09:30.222753 IP localhost.18123 > localhost.34898: Flags [P.], seq 1730:1735, ack 979, win 512, options [nop,nop,TS val 1270302699 ecr 1270302302], length 5
17:09:30.223104 IP localhost.34898 > localhost.18123: Flags [.], ack 1735, win 512, options [nop,nop,TS val 1270302699 ecr 1270302699], length 0
17:09:30.223744 IP localhost.34898 > localhost.18123: Flags [F.], seq 979, ack 1735, win 512, options [nop,nop,TS val 1270302700 ecr 1270302699], length 0
17:09:30.231821 IP localhost.18123 > localhost.34898: Flags [F.], seq 1735, ack 980, win 512, options [nop,nop,TS val 1270302708 ecr 1270302700], length 0
17:09:30.231868 IP localhost.34898 > localhost.18123: Flags [.], ack 1736, win 512, options [nop,nop,TS val 1270302708 ecr 1270302708], length 0
slvrtrn commented 11 months ago

Thanks for the report! This is an interesting one.

I write a proper patch if you don't have the resources.

Contributions are always welcome :)

varrocs commented 11 months ago

I've just created a pull request: https://github.com/ClickHouse/clickhouse-js/pull/203

slvrtrn commented 11 months ago

Released as 0.2.3. @varrocs, thanks for the fix!