ClickHouse / clickhouse-connect

Python driver/sqlalchemy/superset connectors
Apache License 2.0

Feature Request: Implement Progress Tracking for Long-Running Queries #285

Open xujryan opened 10 months ago

xujryan commented 10 months ago

Description:

I would like to suggest implementing a progress tracking mechanism for long-running queries, such as an INSERT from S3. This would make it much easier to monitor the execution of these extensive operations.

Motivation:

In many cases, queries in ClickHouse can take a substantial amount of time to execute, ranging from several minutes to hours or even days. During such long-running operations, users currently have no way to monitor query progress. A progress tracking feature, akin to what the ClickHouse CLI client offers, would improve the user experience by providing real-time updates on query execution, and would also help in diagnosing and troubleshooting any issues that arise during these lengthy queries.

genzgd commented 10 months ago

Unfortunately, there's currently no way to do this using existing Python HTTP libraries and ClickHouse's HTTP/1.1 interface. While ClickHouse does return intermediate progress headers, neither the requests nor the httpx library actually reads those headers. So it will take a fair amount of work on either the Python side or the ClickHouse side (possibly in the form of HTTP/2 support) to implement this feature.
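
For reference, here is a rough sketch of what consuming those progress headers could look like if a client did expose the raw header lines. It assumes the header is named X-ClickHouse-Progress and carries a JSON object with string-encoded counters such as read_rows and total_rows_to_read, as described in the ClickHouse HTTP interface documentation for the send_progress_in_http_headers setting. The function names and sample values are hypothetical, and none of this is part of clickhouse-connect.

```python
import json

# Header name as documented for ClickHouse's HTTP interface (compared lowercase).
PROGRESS_HEADER = "x-clickhouse-progress"


def parse_progress_line(line):
    """Return the progress counters from one raw header line, or None.

    Hypothetical helper: assumes the caller can see raw "Name: value" lines.
    """
    name, sep, value = line.partition(":")
    if not sep or name.strip().lower() != PROGRESS_HEADER:
        return None
    raw = json.loads(value.strip())
    # Assumption: every counter is an integer encoded as a JSON string.
    return {key: int(val) for key, val in raw.items()}


def percent_done(progress):
    """Return percent of rows read, or None if the total is unknown."""
    total = progress.get("total_rows_to_read", 0)
    if total <= 0:
        return None
    return 100.0 * progress.get("read_rows", 0) / total


# Illustrative header lines; the numbers are made up for this example.
sample_lines = [
    'X-ClickHouse-Progress: {"read_rows":"250000","read_bytes":"2000000","total_rows_to_read":"1000000"}',
    'X-ClickHouse-Progress: {"read_rows":"500000","read_bytes":"4000000","total_rows_to_read":"1000000"}',
    "Content-Type: text/plain",
]

snapshots = [p for p in map(parse_progress_line, sample_lines) if p is not None]
latest_percent = percent_done(snapshots[-1])
```

The parsing itself is simple. The hard part remains getting an HTTP client to surface each header as it arrives rather than only after the response completes.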