chdb-io / chdb

chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse
https://clickhouse.com/chdb
Apache License 2.0

0.11.0: recursive_mutex lock failed: Invalid argument. (STD_EXCEPTION) #60

Closed lmangani closed 1 year ago

lmangani commented 1 year ago

Stock queries are failing with chdb 0.11.0. Here is an example that we know works as expected with chdb 0.10.x:

query

SELECT
    town,
    district,
    count() AS c,
    round(avg(price)) AS price
FROM url('https://datasets-documentation.s3.eu-west-3.amazonaws.com/house_parquet/house_0.parquet')
GROUP BY
    town,
    district
LIMIT 10
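For anyone who wants to try this from Python, here is a minimal reproduction sketch. It assumes the `chdb.query(sql, output_format)` entry point from the chdb README; the `"CSV"` output format and the helper name `run_reproduction` are illustrative choices, not part of the original report.

```python
# The query from this report, kept verbatim.
QUERY = """
SELECT
    town,
    district,
    count() AS c,
    round(avg(price)) AS price
FROM url('https://datasets-documentation.s3.eu-west-3.amazonaws.com/house_parquet/house_0.parquet')
GROUP BY
    town,
    district
LIMIT 10
"""


def run_reproduction(output_format: str = "CSV"):
    """Run the report's query through chdb (hypothetical helper).

    chdb is imported lazily so this module can be loaded without it.
    Needs network access, because the query reads a remote Parquet file.
    """
    import chdb

    return chdb.query(QUERY, output_format)
```

Calling `print(run_reproduction())` under chdb 0.10.x should print the query results; under 0.11.0 it should surface the error shown below.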

Failing response

Code: 1001. DB::Exception: std::__1::system_error: recursive_mutex lock failed: Invalid argument. (STD_EXCEPTION)

Expected response

query results

auxten commented 1 year ago

I have fixed chdb-server and also made some optimizations. It now runs chdb.query without forking a new process.

The old implementation could cause a deadlock. I'm still not sure whether that is what causes the recursive_mutex lock failure.

The previous implementation had a bug that caused stderr to block when the pipe buffer was full. The issue I'm experiencing here is that it gets stuck on the query without ever getting a chance to run read_pipe. I suspect that the recursive lock problem reported by ClickHouse may have the same cause, but we can't confirm that yet. Let's wait for @nmreadelf's debug output.
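The pipe-buffer problem described above is a general one, and a small sketch may make it concrete. This is not the chdb-server code: the child script, the 200,000-byte payload, and the helper name `run_child` are all hypothetical. If the parent read stdout to the end before touching stderr, a child that fills the stderr pipe would block on its write and the parent would wait forever. Draining both pipes together, as `subprocess.Popen.communicate()` does, avoids that.

```python
import subprocess
import sys

# Hypothetical child: floods stderr well past a typical pipe buffer
# before writing its actual result to stdout.
CHILD = (
    "import sys\n"
    "sys.stderr.write('x' * 200_000)\n"
    "sys.stdout.write('result')\n"
)


def run_child():
    """Run the child and collect stdout and stderr without deadlocking."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    # communicate() reads both pipes until EOF, so the child never
    # blocks on a full stderr pipe while the parent waits on stdout.
    out, err = proc.communicate(timeout=30)
    return out, err, proc.returncode
```

Replacing `communicate()` with `proc.stdout.read()` followed by `proc.stderr.read()` would be expected to hang with this child, which is the shape of the bug described above.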

lmangani commented 1 year ago

The issue persists with the latest chdb-server:

Code: 1001. DB::Exception: std::__1::system_error: recursive_mutex lock failed: Invalid argument. (STD_EXCEPTION)

auxten commented 1 year ago

It seems to be another ClickHouse issue. I'm trying to sort out why they do that.

nmreadelf commented 1 year ago

Here is the backtrace with debug symbols.

2023.07.17 15:57:25.466294 [ 154640 ] {} <Debug> Application: Working directory created: /tmp/clickhouse-local-154588-1689580645-13439769555555602661
Setting up /tmp/clickhouse-local-154588-1689580645-13439769555555602661/tmp/ to store temporary data in it
Added users.xml access storage 'users.xml', path:
65432cd2-c681-425d-a79b-178571bfec12 Authenticating user 'default' from 127.0.0.1:0
65432cd2-c681-425d-a79b-178571bfec12 Authenticated with global context as user 94309d50-4f52-5250-31bd-74fecac179db
65432cd2-c681-425d-a79b-178571bfec12 Creating session context with user_id: 94309d50-4f52-5250-31bd-74fecac179db
Settings: readonly = 0, allow_ddl = true, allow_introspection_functions = false
List of all grants: GRANT SHOW, SELECT, INSERT, ALTER, CREATE, DROP, UNDROP TABLE, TRUNCATE, OPTIMIZE, BACKUP, KILL QUERY, KILL TRANSACTION, MOVE PARTITION BETWEEN SHARDS, SYSTEM, dictGet, displaySecretsInShowAndSelect, INTROSPECTION, SOURCES, CLUSTER ON *.*
List of all grants including implicit: GRANT SHOW, SELECT, INSERT, ALTER, CREATE, DROP, UNDROP TABLE, TRUNCATE, OPTIMIZE, BACKUP, KILL QUERY, KILL TRANSACTION, MOVE PARTITION BETWEEN SHARDS, SYSTEM, dictGet, displaySecretsInShowAndSelect, INTROSPECTION, SOURCES, CLUSTER ON *.*
65432cd2-c681-425d-a79b-178571bfec12 Creating query context from session context, user_id: 94309d50-4f52-5250-31bd-74fecac179db, parent context user: default
Query span trace_id for opentelemetry log: 00000000-0000-0000-0000-000000000000
(from 0.0.0.0:0, user: ) SELECT town, district, count() AS c, round(avg(price)) AS price FROM url('https://datasets-documentation.s3.eu-west-3.amazonaws.com/house_parquet/house_0.parquet') GROUP BY town, district LIMIT 10 (stage: Complete)
std::exception. Code: 1001, type: std::__1::system_error, e.what() = recursive_mutex lock failed: Invalid argument (version 23.6.1.1) (from 0.0.0.0:0) (in query: SELECT town, district, count() AS c, round(avg(price)) AS price FROM url('https://datasets-documentation.s3.eu-west-3.amazonaws.com/house_parquet/house_0.parquet') GROUP BY town, district LIMIT 10), Stack trace (when copying this message, always include the lines below):

0. std::exception::capture() @ 0x000000001a1ce422 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
1. std::exception::exception[abi:v15000]() @ 0x000000001a1ce3ed in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
2. ./buildlib/./contrib/llvm-project/libcxx/src/support/runtime/stdexcept_default.ipp:33: std::runtime_error::runtime_error(String const&) @ 0x000000003457a85d in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
3. ./buildlib/./contrib/llvm-project/libcxx/src/system_error.cpp:253: std::system_error::system_error(std::error_code, char const*) @ 0x00000000345854f3 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
4. ./buildlib/./contrib/llvm-project/libcxx/src/system_error.cpp:290: std::__throw_system_error(int, char const*) @ 0x0000000034585a0f in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
5. ./buildlib/./contrib/llvm-project/libcxx/src/mutex.cpp:79: std::recursive_mutex::lock() @ 0x0000000034572815 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
6. ./buildlib/./contrib/llvm-project/libcxx/include/__mutex_base:122: std::unique_lock<std::recursive_mutex>::unique_lock[abi:v15000](std::recursive_mutex&) @ 0x000000002ac3bd27 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
7. ./buildlib/./src/Interpreters/Context.cpp:723: DB::Context::getLock() const @ 0x000000002ac196f7 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
8. ./buildlib/./src/Interpreters/Context.cpp:1016: DB::Context::getConfigRef() const @ 0x000000002ac1de5d in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
9. ./buildlib/./src/Common/NamedCollections/NamedCollectionUtils.cpp:383: DB::NamedCollectionUtils::loadIfNotUnlocked(std::unique_lock<std::mutex>&) @ 0x0000000029e58bbe in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
10. ./buildlib/./src/Common/NamedCollections/NamedCollectionUtils.cpp:393: DB::NamedCollectionUtils::loadIfNot() @ 0x0000000029e58cbb in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
11. ./buildlib/./src/Storages/NamedCollectionsHelpers.cpp:75: DB::tryGetNamedCollectionWithOverrides(absl::lts_20211102::InlinedVector<std::shared_ptr<DB::IAST>, 7ul, std::allocator<std::shared_ptr<DB::IAST>>>, std::shared_ptr<DB::Context const>, bool, std::vector<std::pair<String, std::shared_ptr<DB::IAST>>, std::allocator<std::pair<String, std::shared_ptr<DB::IAST>>>>*) @ 0x000000002c6137b5 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
12. ./buildlib/./src/TableFunctions/TableFunctionURL.cpp:48: DB::TableFunctionURL::parseArgumentsImpl(absl::lts_20211102::InlinedVector<std::shared_ptr<DB::IAST>, 7ul, std::allocator<std::shared_ptr<DB::IAST>>>&, std::shared_ptr<DB::Context const> const&) @ 0x0000000029b272ad in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
13. ./buildlib/./src/TableFunctions/ITableFunctionFileLike.cpp:49: DB::ITableFunctionFileLike::parseArguments(std::shared_ptr<DB::IAST> const&, std::shared_ptr<DB::Context const>) @ 0x0000000029b24d41 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
14. ./buildlib/./src/TableFunctions/TableFunctionURL.cpp:43: DB::TableFunctionURL::parseArguments(std::shared_ptr<DB::IAST> const&, std::shared_ptr<DB::Context const>) @ 0x0000000029b271d7 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
15. ./buildlib/./src/TableFunctions/TableFunctionFactory.cpp:49: DB::TableFunctionFactory::get(std::shared_ptr<DB::IAST> const&, std::shared_ptr<DB::Context const>) const @ 0x0000000029ec5715 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
16. ./buildlib/./src/Interpreters/Context.cpp:1509: DB::Context::executeTableFunction(std::shared_ptr<DB::IAST> const&, DB::ASTSelectQuery const*) @ 0x000000002ac22233 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
17. ./buildlib/./src/Interpreters/JoinedTables.cpp:211: DB::JoinedTables::getLeftTableStorage() @ 0x000000002bbebd16 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
18. ./buildlib/./src/Interpreters/InterpreterSelectQuery.cpp:434: DB::InterpreterSelectQuery::InterpreterSelectQuery(std::shared_ptr<DB::IAST> const&, std::shared_ptr<DB::Context> const&, std::optional<DB::Pipe>, std::shared_ptr<DB::IStorage> const&, DB::SelectQueryOptions const&, std::vector<String, std::allocator<String>> const&, std::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::shared_ptr<DB::PreparedSets>) @ 0x000000002babe234 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
19. ./buildlib/./src/Interpreters/InterpreterSelectQuery.cpp:211: DB::InterpreterSelectQuery::InterpreterSelectQuery(std::shared_ptr<DB::IAST> const&, std::shared_ptr<DB::Context> const&, DB::SelectQueryOptions const&, std::vector<String, std::allocator<String>> const&) @ 0x000000002babd748 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
20. ./buildlib/./contrib/llvm-project/libcxx/include/__memory/unique_ptr.h:714: std::__unique_if<DB::InterpreterSelectQuery>::__unique_single std::make_unique[abi:v15000]<DB::InterpreterSelectQuery, std::shared_ptr<DB::IAST> const&, std::shared_ptr<DB::Context>&, DB::SelectQueryOptions&, std::vector<String, std::allocator<String>> const&>(std::shared_ptr<DB::IAST> const&, std::shared_ptr<DB::Context>&, DB::SelectQueryOptions&, std::vector<String, std::allocator<String>> const&) @ 0x000000002bbc01c1 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
21. ./buildlib/./src/Interpreters/InterpreterSelectWithUnionQuery.cpp:254: DB::InterpreterSelectWithUnionQuery::buildCurrentChildInterpreter(std::shared_ptr<DB::IAST> const&, std::vector<String, std::allocator<String>> const&) @ 0x000000002bbbd649 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
22. ./buildlib/./src/Interpreters/InterpreterSelectWithUnionQuery.cpp:152: DB::InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery(std::shared_ptr<DB::IAST> const&, std::shared_ptr<DB::Context>, DB::SelectQueryOptions const&, std::vector<String, std::allocator<String>> const&) @ 0x000000002bbbcac5 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
23. ./buildlib/./contrib/llvm-project/libcxx/include/__memory/unique_ptr.h:714: std::__unique_if<DB::InterpreterSelectWithUnionQuery>::__unique_single std::make_unique[abi:v15000]<DB::InterpreterSelectWithUnionQuery, std::shared_ptr<DB::IAST>&, std::shared_ptr<DB::Context>&, DB::SelectQueryOptions const&>(std::shared_ptr<DB::IAST>&, std::shared_ptr<DB::Context>&, DB::SelectQueryOptions const&) @ 0x000000002c098c14 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
24. ./buildlib/./src/Interpreters/InterpreterFactory.cpp:160: DB::InterpreterFactory::get(std::shared_ptr<DB::IAST>&, std::shared_ptr<DB::Context>, DB::SelectQueryOptions const&) @ 0x000000002c0967c8 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
25. ./buildlib/./src/Interpreters/executeQuery.cpp:692: DB::executeQueryImpl(char const*, char const*, std::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum, DB::ReadBuffer*) @ 0x000000002c04ff8a in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
26. ./buildlib/./src/Interpreters/executeQuery.cpp:1168: DB::executeQuery(String const&, std::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum) @ 0x000000002c04bdc4 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
27. ./buildlib/./src/Client/LocalConnection.cpp:112: DB::LocalConnection::sendQuery(DB::ConnectionTimeouts const&, String const&, std::unordered_map<String, String, std::hash<String>, std::equal_to<String>, std::allocator<std::pair<String const, String>>> const&, String const&, unsigned long, DB::Settings const*, DB::ClientInfo const*, bool, std::function<void (DB::Progress const&)>) @ 0x000000002d21ed07 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
28. ./buildlib/./src/Client/ClientBase.cpp:943: DB::ClientBase::processOrdinaryQuery(String const&, std::shared_ptr<DB::IAST>) @ 0x000000002d179d25 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
29. ./buildlib/./src/Client/ClientBase.cpp:1785: DB::ClientBase::processParsedSingleQuery(String const&, String const&, std::shared_ptr<DB::IAST>, std::optional<bool>, bool) @ 0x000000002d178953 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
30. ./buildlib/./src/Client/ClientBase.cpp:2049: DB::ClientBase::executeMultiQuery(String const&) @ 0x000000002d181437 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
31. ./buildlib/./src/Client/ClientBase.cpp:2194: DB::ClientBase::processQueryText(String const&) @ 0x000000002d182896 in /home/ec2user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/chdb/_chdb.cpython-39-x86_64-linux-gnu.so

Peak memory usage (for query): 145.50 MiB.
2023.07.17 15:57:26.186701 [ 154640 ] {} <Debug> Application: Removing temporary directory: /tmp/clickhouse-local-154588-1689580645-13439769555555602661
Code: 1001. DB::Exception: std::__1::system_error: recursive_mutex lock failed: Invalid argument. (STD_EXCEPTION)
2023.07.17 15:57:26.186917 [ 154640 ] {} <Debug> Application: Uninitializing subsystem: Logging Subsystem
Peak memory usage (for user): 16.94 KiB.
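A note for reading the message: frame 5 of the trace is std::recursive_mutex::lock() raising a system_error, and the "Invalid argument" suffix is the standard description of the errno value EINVAL. The sketch below only shows that mapping. The helper name `describe_errno` is illustrative, and nothing in it is specific to chdb or ClickHouse. It does not explain why the lock call failed, which this thread has not yet established.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return the symbolic name and OS description for an errno value."""
    return f"{errno.errorcode[code]}: {os.strerror(code)}"


# The description for EINVAL is the suffix seen in the exception text.
print(describe_errno(errno.EINVAL))
```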