ClickHouse / clickhouse-odbc

ODBC driver for ClickHouse
https://clickhouse.tech
Apache License 2.0
239 stars 84 forks source link

SAS integration #415

Open msestak opened 1 year ago

msestak commented 1 year ago

SAS: 9.4 M7 Clickhouse: 22.3.9.19 Clickhouse ODBC driver: 1.2.1.20220905 running on RedHat 8

We are trying to connect SAS to Clickhouse database to explore data in SAS. We noticed that for every query we run in SAS there are two queries coming to ClickHouse database: 1) SELECT * FROM table 2) real SQL query (count() here)

Example for small table (5 rows) from clickhouse-server.log:

2022.12.22 14:44:35.060979 [ 63572 ] {} <Trace> HTTPHandler-factory: HTTP Request for HTTPHandler-factory. Method: POST, Address: 10.200.117.11:60504, User-Agent: clickhouse-odbc/1.2.1.20220905 (Linux-4.18.0-372.13.1.el8_6.x86_64) UNICOD
E, Content Type: , Transfer Encoding: chunked, X-Forwarded-For: (none)
2022.12.22 14:44:35.061077 [ 63572 ] {} <Trace> DynamicQueryHandler: Request URI: /query?default_format=ODBCDriver2&database=bddwh&session_id=clickhouse_odbc_18017337773684653415
2022.12.22 14:44:35.061095 [ 63572 ] {} <Debug> HTTP-Session: dc9da0d3-2e45-4c07-a09b-42c6907b4eda Authenticating user 'bddwh' from 10.200.117.11:60504
2022.12.22 14:44:35.061125 [ 63572 ] {} <Debug> HTTP-Session: dc9da0d3-2e45-4c07-a09b-42c6907b4eda Authenticated with global context as user 114351ec-af4b-8cb6-7c96-605f6684b7a1
2022.12.22 14:44:35.061137 [ 63572 ] {} <Debug> HTTP-Session: dc9da0d3-2e45-4c07-a09b-42c6907b4eda Creating named session context with name: clickhouse_odbc_18017337773684653415, user_id: 114351ec-af4b-8cb6-7c96-605f6684b7a1
2022.12.22 14:44:35.061152 [ 63572 ] {} <Debug> HTTP-Session: dc9da0d3-2e45-4c07-a09b-42c6907b4eda Creating query context from session context, user_id: 114351ec-af4b-8cb6-7c96-605f6684b7a1, parent context user: bddwh
2022.12.22 14:44:35.061348 [ 63572 ] {8e3abfe8-9bfb-4fc3-8c03-ea7511b2a4a7} <Debug> executeQuery: (from 10.200.117.11:60504, user: bddwh) SELECT * FROM mala_tablica
2022.12.22 14:44:35.061567 [ 63572 ] {8e3abfe8-9bfb-4fc3-8c03-ea7511b2a4a7} <Trace> ContextAccess (bddwh): Access granted: SELECT(base_amount_hrk, por_paym_desc) ON bddwh.mala_tablica
2022.12.22 14:44:35.061610 [ 63572 ] {8e3abfe8-9bfb-4fc3-8c03-ea7511b2a4a7} <Trace> ContextAccess (bddwh): Access granted: SELECT(base_amount_hrk, por_paym_desc) ON bddwh.mala_tablica
2022.12.22 14:44:35.061638 [ 63572 ] {8e3abfe8-9bfb-4fc3-8c03-ea7511b2a4a7} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2022.12.22 14:44:35.061686 [ 63572 ] {8e3abfe8-9bfb-4fc3-8c03-ea7511b2a4a7} <Debug> bddwh.mala_tablica (d88b9cfe-720e-49ed-befc-ffc19bf9878f) (SelectExecutor): Key condition: unknown
2022.12.22 14:44:35.061708 [ 63572 ] {8e3abfe8-9bfb-4fc3-8c03-ea7511b2a4a7} <Debug> bddwh.mala_tablica (d88b9cfe-720e-49ed-befc-ffc19bf9878f) (SelectExecutor): Selected 1/1 parts by partition key, 1 parts by primary key, 1/1 marks by pri
mary key, 1 marks to read from 1 ranges
2022.12.22 14:44:35.061731 [ 63572 ] {8e3abfe8-9bfb-4fc3-8c03-ea7511b2a4a7} <Trace> MergeTreeInOrderSelectProcessor: Reading 1 ranges in order from part all_6_6_0, approx. 5 rows starting from 0
2022.12.22 14:44:35.062033 [ 63572 ] {8e3abfe8-9bfb-4fc3-8c03-ea7511b2a4a7} <Information> executeQuery: Read 5 rows, 140.00 B in 0.000655897 sec., 7623 rows/sec., 208.45 KiB/sec.
2022.12.22 14:44:35.062067 [ 63572 ] {8e3abfe8-9bfb-4fc3-8c03-ea7511b2a4a7} <Debug> DynamicQueryHandler: Done processing query
2022.12.22 14:44:35.062076 [ 63572 ] {8e3abfe8-9bfb-4fc3-8c03-ea7511b2a4a7} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.
2022.12.22 14:44:35.062092 [ 63572 ] {} <Debug> HTTP-Session: dc9da0d3-2e45-4c07-a09b-42c6907b4eda Destroying named session 'clickhouse_odbc_18017337773684653415' of user 114351ec-af4b-8cb6-7c96-605f6684b7a1
2022.12.22 14:44:35.063968 [ 63572 ] {} <Trace> HTTPHandler-factory: HTTP Request for HTTPHandler-factory. Method: POST, Address: 10.200.117.11:60504, User-Agent: clickhouse-odbc/1.2.1.20220905 (Linux-4.18.0-372.13.1.el8_6.x86_64) UNICOD
E, Content Type: , Transfer Encoding: chunked, X-Forwarded-For: (none)
2022.12.22 14:44:35.064003 [ 63572 ] {} <Trace> DynamicQueryHandler: Request URI: /query?default_format=ODBCDriver2&database=bddwh&session_id=clickhouse_odbc_18017337773684653415
2022.12.22 14:44:35.064013 [ 63572 ] {} <Debug> HTTP-Session: df22e01a-f979-41a5-9e1b-393e3a3f98c6 Authenticating user 'bddwh' from 10.200.117.11:60504
2022.12.22 14:44:35.064021 [ 63572 ] {} <Debug> HTTP-Session: df22e01a-f979-41a5-9e1b-393e3a3f98c6 Authenticated with global context as user 114351ec-af4b-8cb6-7c96-605f6684b7a1
2022.12.22 14:44:35.064029 [ 63572 ] {} <Debug> HTTP-Session: df22e01a-f979-41a5-9e1b-393e3a3f98c6 Creating named session context with name: clickhouse_odbc_18017337773684653415, user_id: 114351ec-af4b-8cb6-7c96-605f6684b7a1
2022.12.22 14:44:35.064038 [ 63572 ] {} <Debug> HTTP-Session: df22e01a-f979-41a5-9e1b-393e3a3f98c6 Creating query context from session context, user_id: 114351ec-af4b-8cb6-7c96-605f6684b7a1, parent context user: bddwh
2022.12.22 14:44:35.064151 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Debug> executeQuery: (from 10.200.117.11:60504, user: bddwh)  select COUNT(TXT_1.`base_amount_hrk`) from bddwh.mala_tablica TXT_1
2022.12.22 14:44:35.064288 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Trace> ContextAccess (bddwh): Access granted: SELECT(base_amount_hrk) ON bddwh.mala_tablica
2022.12.22 14:44:35.064323 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Trace> ContextAccess (bddwh): Access granted: SELECT(base_amount_hrk) ON bddwh.mala_tablica
2022.12.22 14:44:35.064337 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2022.12.22 14:44:35.064375 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Debug> bddwh.mala_tablica (d88b9cfe-720e-49ed-befc-ffc19bf9878f) (SelectExecutor): Key condition: unknown
2022.12.22 14:44:35.064386 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Debug> bddwh.mala_tablica (d88b9cfe-720e-49ed-befc-ffc19bf9878f) (SelectExecutor): Selected 1/1 parts by partition key, 1 parts by primary key, 1/1 marks by pri
mary key, 1 marks to read from 1 ranges
2022.12.22 14:44:35.064396 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Trace> MergeTreeInOrderSelectProcessor: Reading 1 ranges in order from part all_6_6_0, approx. 5 rows starting from 0
2022.12.22 14:44:35.064535 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Trace> AggregatingTransform: Aggregating
2022.12.22 14:44:35.064543 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Trace> Aggregator: Aggregation method: without_key
2022.12.22 14:44:35.064555 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Debug> AggregatingTransform: Aggregated. 5 to 1 rows (from 40.00 B) in 0.000123054 sec. (40632.568 rows/sec., 317.44 KiB/sec.)
2022.12.22 14:44:35.064558 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Trace> Aggregator: Merging aggregated data
2022.12.22 14:44:35.064639 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Information> executeQuery: Read 5 rows, 40.00 B in 0.000467699 sec., 10690 rows/sec., 83.52 KiB/sec.
2022.12.22 14:44:35.064675 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Debug> DynamicQueryHandler: Done processing query
2022.12.22 14:44:35.064682 [ 63572 ] {753a9181-07e8-4db8-88cc-0a5d35e1383b} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.
2022.12.22 14:44:35.064689 [ 63572 ] {} <Debug> HTTP-Session: df22e01a-f979-41a5-9e1b-393e3a3f98c6 Destroying named session 'clickhouse_odbc_18017337773684653415' of user 114351ec-af4b-8cb6-7c96-605f6684b7a1

How to disable this first query (SELECT * FROM table) because it breaks on bigger tables:

2022.12.22 14:07:46.561707 [ 63572 ] {fecad5ac-c066-4154-b9f6-a49f4c8b2719} <Error> DynamicQueryHandler: Cannot send exception to client: Code: 24. DB::Exception: Cannot write to ostream at offset 1048930. (CANNOT_WRITE_TO_OSTREAM), Stac
k trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xb3744fa in /usr/bin/clickhouse
1. DB::WriteBufferFromOStream::nextImpl() @ 0xb481b32 in /usr/bin/clickhouse
2. DB::WriteBufferFromHTTPServerResponse::nextImpl() @ 0x168d8ad1 in /usr/bin/clickhouse
3. DB::HTTPHandler::trySendExceptionToClient(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, DB::HTTPServerRequest&, DB::HTTPServerResponse&, DB::HTTPHandler::Output&) @ 0x16649a27 in /u
sr/bin/clickhouse
4. DB::HTTPHandler::handleRequest(DB::HTTPServerRequest&, DB::HTTPServerResponse&) @ 0x1664b2f5 in /usr/bin/clickhouse
5. DB::HTTPServerConnection::run() @ 0x168d1b5b in /usr/bin/clickhouse
6. Poco::Net::TCPServerConnection::start() @ 0x19b899ef in /usr/bin/clickhouse
7. Poco::Net::TCPServerDispatcher::run() @ 0x19b8be41 in /usr/bin/clickhouse
8. Poco::PooledThread::run() @ 0x19d48e09 in /usr/bin/clickhouse
9. Poco::ThreadImpl::runnableEntry(void*) @ 0x19d46160 in /usr/bin/clickhouse
10. start_thread @ 0x81cf in /usr/lib64/libpthread-2.28.so
11. clone @ 0x39dd3 in /usr/lib64/libc-2.28.so
 (version 22.3.9.19 (official build))
2022.12.22 14:07:46.561736 [ 63572 ] {fecad5ac-c066-4154-b9f6-a49f4c8b2719} <Debug> MemoryTracker: Peak memory usage (for query): 518.69 MiB.
2022.12.22 14:07:46.561756 [ 63572 ] {} <Debug> MemoryTracker: Peak memory usage (for query): 518.69 MiB.
2022.12.22 14:07:46.561768 [ 63572 ] {} <Debug> HTTP-Session: db31310d-def6-4567-9d74-7de10fc7c265 Destroying named session 'clickhouse_odbc_1405935089709214907' of user 114351ec-af4b-8cb6-7c96-605f6684b7a1
2022.12.22 14:07:46.562020 [ 63572 ] {} <Error> ServerErrorHandler: Poco::Exception. Code: 1000, e.code() = 32, I/O error: Broken pipe, Stack trace (when copying this message, always include the lines below):

Driver error log:

1671714466781968 [3135137:139993351079680] /clickhouse-odbc/driver/api/odbc.cpp:493 in SQLPrepareW: SQLPrepareW statement_text_size=34 statement_text=0x7f52ac060d90
1671714466782036 [3135137:139993351079680] /clickhouse-odbc/driver/statement.cpp:126 in requestNextPackOfResultSets: POST bdtclick10.bdtest.sit.pbz.hr/query?default_format=ODBCDriver2&database=bddwh&session_id=clickhouse_odbc_1405935089709214907 body=SELECT * FROM sas_io_ca_clcd_crm_3 UA=clickhouse-odbc/1.2.1.20220905 (Linux-4.18.0-372.13.1.el8_6.x86_64) UNICODE
1671714466811519 [3135137:139993351079680] /clickhouse-odbc/driver/api/odbc.cpp:685 in operator(): operator() column_number=1 name=ID type=8 size=15 nullable=0
1671714466811558 [3135137:139993351079680] /clickhouse-odbc/driver/api/odbc.cpp:556 in SQLColAttributeW: SQLColAttributeW(col=1, field=9)
1671714466811573 [3135137:139993351079680] /clickhouse-odbc/driver/api/odbc.cpp:556 in SQLColAttributeW: SQLColAttributeW(col=1, field=10)
...
1671714466978835 [3135137:139992669746944] /clickhouse-odbc/driver/api/odbc.cpp:72 in SQLFreeStmt: SQLFreeStmt option=0
1671714466978851 [3135137:139992669746944] /clickhouse-odbc/driver/api/odbc.cpp:493 in SQLPrepareW: SQLPrepareW statement_text_size=34 statement_text=0x7f5284004340
1671714466978919 [3135137:139992669746944] /clickhouse-odbc/driver/statement.cpp:126 in requestNextPackOfResultSets: POST bdtclick10.bdtest.sit.pbz.hr/query?default_format=ODBCDriver2&database=bddwh&session_id=clickhouse_odbc_1405935089709214907 body=SELECT * FROM sas_io_ca_clcd_crm_3 UA=clickhouse-odbc/1.2.1.20220905 (Linux-4.18.0-372.13.1.el8_6.x86_64) UNICODE
1671714466979125 [3135137:139992669746944] /clickhouse-odbc/driver/statement.cpp:153 in requestNextPackOfResultSets: Http request try=1/3 failed: Malformed message: No HTTP response header
1671714466983277 [3135137:139992669746944] /clickhouse-odbc/driver/statement.cpp:167 in requestNextPackOfResultSets: HTTP status code: 500
Received error:
Code: 373. DB::Exception: Session is locked by a concurrent client. (SESSION_IS_LOCKED) (version 22.3.9.19 (official build))

1671714466983300 [3135137:139992669746944] /clickhouse-odbc/driver/driver.h:250 in operator(): HY000 (HTTP status code: 500
Received error:
Code: 373. DB::Exception: Session is locked by a concurrent client. (SESSION_IS_LOCKED) (version 22.3.9.19 (official build))

)
1671714466983419 [3135137:139992669746944] /clickhouse-odbc/driver/api/odbc.cpp:1594 in SQLEndTran: SQLEndTran
1671714466983434 [3135137:139992669746944] /clickhouse-odbc/driver/api/odbc.cpp:57 in SQLFreeHandle: SQLFreeHandle handleType=3 handle=0x7f5284001cc0
1671714468120829 [3135137:139993351079680] /clickhouse-odbc/driver/api/odbc.cpp:72 in SQLFreeStmt: SQLFreeStmt option=0
1671714468121128 [3135137:139993351079680] /clickhouse-odbc/driver/api/odbc.cpp:57 in SQLFreeHandle: SQLFreeHandle handleType=3 handle=0x7f52ac011150

and it slows everything down even if it succeeds.

bkuspayev commented 1 year ago

any solution? We faced the same issue. May be any temporary solution for make it work?

mshustov commented 12 months ago

The problem is that SAS (or another tool) does SELECT * FROM table, which crashes the server. If there's no way to change this logic in SAS, you can work it around by setting max_query_size. but you need to be aware that SAS won't have all the data retrieved from DB.

mshustov commented 11 months ago

@bkuspayev, what driver version do you use? The latest version generates a random session id for connections https://github.com/ClickHouse/clickhouse-odbc/releases/tag/v1.2.1.20220905

bkuspayev commented 11 months ago

@mshustov thank you for quick response, I'm using the exact same version as you mentioned: v1.2.1.20220905

mshustov commented 11 months ago

@bkuspayev do you specify a session name manually? https://github.com/ClickHouse/clickhouse-odbc/pull/398/files#diff-bf39b744552791b8ebec2bcfdab832d997df90796ebf432e5c5fb13ce903df1bR118

bkuspayev commented 11 months ago

I'm not familiar with the internal workings of SAS and its session naming conventions. I haven't made any manual modifications on my end. Here are the parameters from the ODBC ini file:

[Clickhouse] Driver: /sas/clickhouse-odbc-1.2.1-Linux/lib64/libclickhouseodbc.so Url: http://user@server:8123/query?database=db&buffer_size=30000&wait_ens_of_query=0&session_timeout=1 DriverUnicodeType: 1 DriverLog: on Timeout: 5000 Connection: Unique

Let me know if you need further information.