ClickHouse / clickhouse-js

Official JS client for ClickHouse DB
https://clickhouse.com
Apache License 2.0

inserting with JSONColumnsWithMetadata crashes ClickHouse #226

Closed movy closed 6 months ago

movy commented 6 months ago

(This may belong in the upstream ClickHouse repo; if so, please move the issue there.)

As I understand it, JSONColumnsWithMetadata is supported in object insert mode, yet I cannot make it work. Maybe I'm doing something wrong, but instead of returning an error, my insert query crashes ClickHouse:

Code example

await client.command({
    query: `
        CREATE TABLE IF NOT EXISTS titles_embeddings(
            source_site LowCardinality(String),
            forum_id LowCardinality(String),
            post_id String,
            title String,
            suggested_user_id Int32,
            embedding Array(Float64)
        ) ENGINE = ReplacingMergeTree() ORDER BY (source_site, forum_id)
        `
})

await client.insert({
    table: 'titles_embeddings',
    columns: ['source_site', 'forum_id', 'post_id', 'title', 'embedding'],
    values: {
        meta: [
            { name: 'title', type: 'String' },
            { name: 'post_id', type: 'String' },
            { name: 'forum_id', type: 'String' },
            { name: 'source_site', type: 'String' },
            { name: 'embedding', type: 'Array(Float64)' },
        ],
        data: {
            title: ['post title'],
            post_id: ['12345'],
            forum_id: ['forum2'],
            source_site: ['site.com'],
            embedding: [[1.1, 2.2, 3.3]]
        },
    },
    format: 'JSONColumnsWithMetadata'
})

Error log

2024.02.11 08:19:41.414355 [ 2671842 ] {} <Fatal> BaseDaemon: ########## Short fault info ############
2024.02.11 08:19:41.414375 [ 2671842 ] {} <Fatal> BaseDaemon: (version 24.1.3.31 (official build), build id: E65ACEFD4C4A4F209A1529998C6032754B52A0FC, git hash: 135b08cbd28a5832e9e70c3b7d09dd4134845ed3) (from thread 2671015) Received signal 11
2024.02.11 08:19:41.414382 [ 2671842 ] {} <Fatal> BaseDaemon: Signal description: Segmentation fault
2024.02.11 08:19:41.414386 [ 2671842 ] {} <Fatal> BaseDaemon: Address: 0x28. Access: read. Address not mapped to object.
2024.02.11 08:19:41.414393 [ 2671842 ] {} <Fatal> BaseDaemon: Stack trace: 0x000000000e8920f7 0x0000000012aba4bc 0x0000000012ac06c0 0x0000000012abc3ce 0x000000001299fa96 0x000000001297cb15 0x000000001297c563 0x000000001299553a 0x000000001298bf90 0x000000001298b1a0 0x0000000012989932 0x000000001190b283 0x00000000128bd1ba 0x00000000128c20f0 0x000000001294029a 0x00000000153a9932 0x00000000153aa731 0x00000000154a2f07 0x00000000154a153d 0x00007f23b2694ac3 0x00007f23b2726a40
2024.02.11 08:19:41.414397 [ 2671842 ] {} <Fatal> BaseDaemon: ########################################
2024.02.11 08:19:41.414404 [ 2671842 ] {} <Fatal> BaseDaemon: (version 24.1.3.31 (official build), build id: E65ACEFD4C4A4F209A1529998C6032754B52A0FC, git hash: 135b08cbd28a5832e9e70c3b7d09dd4134845ed3) (from thread 2671015) (query_id: 702552e6-a407-497a-b66a-3d84ed6fe6c5) (query: INSERT INTO titles_embeddings FORMAT JSONColumnsWithMetadata
) Received signal Segmentation fault (11)
2024.02.11 08:19:41.414407 [ 2671842 ] {} <Fatal> BaseDaemon: Address: 0x28. Access: read. Address not mapped to object.
2024.02.11 08:19:41.414410 [ 2671842 ] {} <Fatal> BaseDaemon: Stack trace: 0x000000000e8920f7 0x0000000012aba4bc 0x0000000012ac06c0 0x0000000012abc3ce 0x000000001299fa96 0x000000001297cb15 0x000000001297c563 0x000000001299553a 0x000000001298bf90 0x000000001298b1a0 0x0000000012989932 0x000000001190b283 0x00000000128bd1ba 0x00000000128c20f0 0x000000001294029a 0x00000000153a9932 0x00000000153aa731 0x00000000154a2f07 0x00000000154a153d 0x00007f23b2694ac3 0x00007f23b2726a40
2024.02.11 08:19:41.414449 [ 2671842 ] {} <Fatal> BaseDaemon: 2. std::__hash_const_iterator<std::__hash_node<std::__hash_value_type<String, unsigned long>, void*>*> std::__hash_table<std::__hash_value_type<String, unsigned long>, std::__unordered_map_hasher<String, std::__hash_value_type<String, unsigned long>, std::hash<String>, std::equal_to<String>, true>, std::__unordered_map_equal<String, std::__hash_value_type<String, unsigned long>, std::equal_to<String>, std::hash<String>, true>, std::allocator<std::__hash_value_type<String, unsigned long>>>::find<String>(String const&) const @ 0x000000000e8920f7 in /usr/bin/clickhouse
2024.02.11 08:19:41.414485 [ 2671842 ] {} <Fatal> BaseDaemon: 3. DB::JSONUtils::validateMetadataByHeader(DB::NamesAndTypesList const&, DB::Block const&) @ 0x0000000012aba4bc in /usr/bin/clickhouse
2024.02.11 08:19:41.414492 [ 2671842 ] {} <Fatal> BaseDaemon: 4. DB::JSONColumnsWithMetadataReader::readChunkStart() @ 0x0000000012ac06c0 in /usr/bin/clickhouse
2024.02.11 08:19:41.414509 [ 2671842 ] {} <Fatal> BaseDaemon: 5. DB::JSONColumnsBlockInputFormatBase::read() @ 0x0000000012abc3ce in /usr/bin/clickhouse
2024.02.11 08:19:41.414537 [ 2671842 ] {} <Fatal> BaseDaemon: 6. DB::IInputFormat::generate() @ 0x000000001299fa96 in /usr/bin/clickhouse
2024.02.11 08:19:41.414552 [ 2671842 ] {} <Fatal> BaseDaemon: 7. DB::ISource::tryGenerate() @ 0x000000001297cb15 in /usr/bin/clickhouse
2024.02.11 08:19:41.414570 [ 2671842 ] {} <Fatal> BaseDaemon: 8. DB::ISource::work() @ 0x000000001297c563 in /usr/bin/clickhouse
2024.02.11 08:19:41.414589 [ 2671842 ] {} <Fatal> BaseDaemon: 9. DB::ExecutionThreadContext::executeTask() @ 0x000000001299553a in /usr/bin/clickhouse
2024.02.11 08:19:41.414624 [ 2671842 ] {} <Fatal> BaseDaemon: 10. DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x000000001298bf90 in /usr/bin/clickhouse
2024.02.11 08:19:41.414653 [ 2671842 ] {} <Fatal> BaseDaemon: 11. DB::PipelineExecutor::execute(unsigned long, bool) @ 0x000000001298b1a0 in /usr/bin/clickhouse
2024.02.11 08:19:41.414666 [ 2671842 ] {} <Fatal> BaseDaemon: 12. DB::CompletedPipelineExecutor::execute() @ 0x0000000012989932 in /usr/bin/clickhouse
2024.02.11 08:19:41.414701 [ 2671842 ] {} <Fatal> BaseDaemon: 13. DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, std::shared_ptr<DB::Context>, std::function<void (DB::QueryResultDetails const&)>, DB::QueryFlags, std::optional<DB::FormatSettings> const&, std::function<void (DB::IOutputFormat&)>) @ 0x000000001190b283 in /usr/bin/clickhouse
2024.02.11 08:19:41.414728 [ 2671842 ] {} <Fatal> BaseDaemon: 14. DB::HTTPHandler::processQuery(DB::HTTPServerRequest&, DB::HTMLForm&, DB::HTTPServerResponse&, DB::HTTPHandler::Output&, std::optional<DB::CurrentThread::QueryScope>&, StrongTypedef<unsigned long, ProfileEvents::EventTag> const&) @ 0x00000000128bd1ba in /usr/bin/clickhouse
2024.02.11 08:19:41.414745 [ 2671842 ] {} <Fatal> BaseDaemon: 15. DB::HTTPHandler::handleRequest(DB::HTTPServerRequest&, DB::HTTPServerResponse&, StrongTypedef<unsigned long, ProfileEvents::EventTag> const&) @ 0x00000000128c20f0 in /usr/bin/clickhouse
2024.02.11 08:19:41.414771 [ 2671842 ] {} <Fatal> BaseDaemon: 16. DB::HTTPServerConnection::run() @ 0x000000001294029a in /usr/bin/clickhouse
2024.02.11 08:19:41.414789 [ 2671842 ] {} <Fatal> BaseDaemon: 17. Poco::Net::TCPServerConnection::start() @ 0x00000000153a9932 in /usr/bin/clickhouse
2024.02.11 08:19:41.414805 [ 2671842 ] {} <Fatal> BaseDaemon: 18. Poco::Net::TCPServerDispatcher::run() @ 0x00000000153aa731 in /usr/bin/clickhouse
2024.02.11 08:19:41.414819 [ 2671842 ] {} <Fatal> BaseDaemon: 19. Poco::PooledThread::run() @ 0x00000000154a2f07 in /usr/bin/clickhouse
2024.02.11 08:19:41.414839 [ 2671842 ] {} <Fatal> BaseDaemon: 20. Poco::ThreadImpl::runnableEntry(void*) @ 0x00000000154a153d in /usr/bin/clickhouse
2024.02.11 08:19:41.414853 [ 2671842 ] {} <Fatal> BaseDaemon: 21. ? @ 0x00007f23b2694ac3
2024.02.11 08:19:41.414863 [ 2671842 ] {} <Fatal> BaseDaemon: 22. ? @ 0x00007f23b2726a40
2024.02.11 08:19:41.507255 [ 2671842 ] {} <Fatal> BaseDaemon: Integrity check of the executable successfully passed (checksum: C9E7A5A90DFFDAA9E99C379A4F672F42)
2024.02.11 08:19:41.507333 [ 2671842 ] {} <Fatal> BaseDaemon: Report this error to https://github.com/ClickHouse/ClickHouse/issues
2024.02.11 08:19:41.507416 [ 2671842 ] {} <Fatal> BaseDaemon: Changed settings: output_format_json_quote_64bit_integers = false

Expected behaviour

At minimum, a client-side error message instead of a server crash.

slvrtrn commented 6 months ago

Similar behavior with curl:

echo -ne '{"meta":[{"name":"title","type":"String"},{"name":"post_id","type":"String"},{"name":"forum_id","type":"String"},{"name":"source_site","type":"String"},{"name":"embedding","type":"Array(Float64)"}],"data":{"title":["post title"],"post_id":["12345"],"forum_id":["forum2"],"source_site":["site.com"],"embedding":[[1.1,2.2,3.3]]}}\n' | curl 'http://localhost:8123/?query=INSERT%20INTO%20titles_embeddings%20FORMAT%20JSONColumnsWithMetadata' --data-binary @-

Thanks for the report; I created https://github.com/ClickHouse/ClickHouse/issues/59853
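
Until the upstream issue is resolved, one possible workaround (not proposed in this thread, so treat it as an assumption) is to reshape the columnar `data` object into row objects and insert them with the `JSONEachRow` format, which is parsed by a different input format than the `JSONColumnsWithMetadataReader` seen in the stack trace. The helper name `columnsToRows` below is hypothetical and not part of clickhouse-js; this is a minimal sketch rather than a tested fix.

```typescript
// Hypothetical helper (not part of @clickhouse/client): converts a columnar
// object such as { title: ['a'], post_id: ['1'] } into an array of row objects.
type ColumnarData = Record<string, unknown[]>;
type Row = Record<string, unknown>;

function columnsToRows(data: ColumnarData): Row[] {
  const columns = Object.entries(data);
  const lengths = columns.map(([, values]) => values.length);
  const rowCount = lengths.length > 0 ? Math.max(...lengths) : 0;

  // Every column must hold the same number of values.
  for (const [name, values] of columns) {
    if (values.length !== rowCount) {
      throw new Error(
        `Column "${name}" has ${values.length} values, expected ${rowCount}`
      );
    }
  }

  // Build one object per row index, taking the i-th value of each column.
  const rows: Row[] = [];
  for (let i = 0; i < rowCount; i++) {
    const row: Row = {};
    for (const [name, values] of columns) {
      row[name] = values[i];
    }
    rows.push(row);
  }
  return rows;
}

// Same data as in the report above.
const rows = columnsToRows({
  title: ['post title'],
  post_id: ['12345'],
  forum_id: ['forum2'],
  source_site: ['site.com'],
  embedding: [[1.1, 2.2, 3.3]],
});

// Assumed usage with the client from the report:
// await client.insert({
//   table: 'titles_embeddings',
//   values: rows,
//   format: 'JSONEachRow',
// })
```

Columns that are not supplied (here `suggested_user_id`) would be left to the table defaults, as with any insert that omits them.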