datafuselabs / databend

Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
https://docs.databend.com

[query/server/http] Add /v1/query. #2688

Closed youngsofun closed 2 years ago

youngsofun commented 2 years ago

I hereby agree to the terms of the CLA available at: https://databend.rs/policies/cla/

Summary

Add a new endpoint /v1/query to support SQL over HTTP.

handler usage

  1. POST a query as JSON in the HttpQueryRequest format; the server returns JSON in the QueryResponse format.
  2. Pagination with long polling.
  3. Result data is embedded in QueryResponse.
  4. The response returns the following 4 URIs. Clients should rely only on the endpoint /v1/query and the URIs returned in the response; the other endpoints may change without notice.
    1. next: get the next page (together with progress) with long polling. Page N is released when page N+1 is requested.
    2. state: returns a struct like QueryResponse, but with an empty data field; short polling.
    3. kill: kill the running query.
    4. delete: kill the running query and delete it.

test_async can serve as a demo of their usage.
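As a rough illustration (not code from this PR), the client-side paging loop over next_uri could look like the Python sketch below, with `fetch` standing in for an HTTP GET against the returned URIs and the pages mocked as plain dicts:

```python
# Hypothetical sketch of the client-side paging protocol described above.
# `fetch` stands in for an HTTP GET; here it is mocked with canned pages.

def collect_all_rows(fetch, first_response):
    """Follow next_uri until the server stops returning one."""
    rows = list(first_response["data"])
    next_uri = first_response.get("next_uri")
    while next_uri:
        page = fetch(next_uri)           # long polling: blocks until the page is ready
        rows.extend(page["data"])
        next_uri = page.get("next_uri")  # None once the result is exhausted
    return rows

# Mocked server responses keyed by URI (shapes loosely follow QueryResponse).
pages = {
    "/v1/query/abc/page/1": {"data": [[3], [4]], "next_uri": "/v1/query/abc/page/2"},
    "/v1/query/abc/page/2": {"data": [[5]], "next_uri": None},
}
first = {"data": [[1], [2]], "next_uri": "/v1/query/abc/page/1"}
print(collect_all_rows(pages.__getitem__, first))  # [[1], [2], [3], [4], [5]]
```

Note that because page N is only released when page N+1 is requested, a client that crashes mid-loop can re-fetch its current page.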

pub struct HttpQueryRequest {
    pub sql: String,
}
pub struct QueryResponse {
    pub id: Option<String>,
    pub columns: Option<DataSchemaRef>,
    pub data: JsonBlockRef,
    pub next_uri: Option<String>,
    pub state_uri: Option<String>,
    pub kill_uri: Option<String>,
    pub delete_uri: Option<String>,
    pub query_state: Option<String>,
    pub request_error: Option<String>,
    pub query_error: Option<String>,
    pub query_progress: Option<ProgressValues>,
}
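For illustration only, a serialized QueryResponse might look like the JSON below. The field names mirror the Rust struct above; the values, and the exact serde output for fields like columns and query_progress, are made up:

```python
import json

# Illustrative QueryResponse payload: field names follow the Rust struct,
# but every value here is invented for the example.
sample = json.loads("""
{
  "id": "some-query-id",
  "columns": [{"name": "number", "type": "UInt64"}],
  "data": [[0], [1], [2]],
  "next_uri": "/v1/query/some-query-id/page/1",
  "state_uri": "/v1/query/some-query-id/state",
  "kill_uri": "/v1/query/some-query-id/kill",
  "delete_uri": "/v1/query/some-query-id",
  "query_state": "Running",
  "request_error": null,
  "query_error": null,
  "query_progress": {"read_rows": 3, "read_bytes": 24}
}
""")

# The key set matches the struct's fields one-to-one.
assert set(sample) == {
    "id", "columns", "data", "next_uri", "state_uri", "kill_uri",
    "delete_uri", "query_state", "request_error", "query_error",
    "query_progress",
}
```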

internal

  1. The web layer (query_handlers.rs) is separated from a more structured internal implementation in mod v1/query.
    1. /v1/statement is kept for posting raw SQL; it may reuse the internal implementation, but that will be done together with some other changes in another PR.
    2. Hopefully it can even be reused to support gRPC later if needed.
  2. Make sure the internal tokio task stops quickly when the query is killed by a SQL command or an HTTP request.

TODO

Soon (a new PR in a week or so):

  1. The classification and organization of returned errors will be polished.
  2. More tests.
  3. Rewrite /v1/statements.

Long term:

  1. Add client session support.
  2. Stabilize the handler interface with a formal doc; only the necessary fields are included for now, covering the 2 JSON formats plus optional URL parameters and headers.
  3. Better control of memory.

Changelog

Related Issues

Fixes #2604

Test Plan

Unit Tests

databend-bot commented 2 years ago

Thanks for the contribution! I have applied any labels matching special text in your PR Changelog.

Please review the labels and make any necessary changes.

codecov-commenter commented 2 years ago

Codecov Report

Merging #2688 (17d75f9) into main (8f4ba9f) will decrease coverage by 0%. The diff coverage is 100%.

Impacted file tree graph

@@          Coverage Diff           @@
##            main   #2688    +/-   ##
======================================
- Coverage     69%     69%    -1%     
======================================
  Files        619     608    -11     
  Lines      33398   32695   -703     
======================================
- Hits       23155   22640   -515     
+ Misses     10243   10055   -188     
Impacted Files Coverage Δ
common/base/src/progress.rs 77% <100%> (+6%) :arrow_up:
.../src/datasources/table/fuse/meta/table_snapshot.rs 63% <0%> (-13%) :arrow_down:
common/base/src/stop_handle.rs 63% <0%> (-12%) :arrow_down:
...rc/datasources/table/fuse/util/statistic_helper.rs 81% <0%> (-10%) :arrow_down:
...uery/src/datasources/table/fuse/table_do_append.rs 84% <0%> (-8%) :arrow_down:
common/clickhouse-srv/src/types/block/mod.rs 90% <0%> (-6%) :arrow_down:
common/streams/src/stream_progress.rs 90% <0%> (-5%) :arrow_down:
common/datavalues/src/data_value_ops.rs 40% <0%> (-5%) :arrow_down:
common/meta/sled-store/src/sled_key_space.rs 87% <0%> (-4%) :arrow_down:
common/meta/sled-store/src/sled_tree.rs 91% <0%> (-1%) :arrow_down:
... and 63 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update 8f4ba9f...17d75f9.

youngsofun commented 2 years ago

/help

databend-bot commented 2 years ago

/assignme -- assign the issue to you
/review @[username] -- take a reviewer for you
/help -- show help

youngsofun commented 2 years ago

/review @zhang2014 @BohuTANG @flaneur2020

databend-bot commented 2 years ago

Take the reviewer to zhang2014 @bohutang @flaneur2020

sundy-li commented 2 years ago

I noticed query_id is generated by the server, but what if we want to retry an insert/update query?

At time T1 the client sends an update SQL: update table set money = money + 1. The server handles it and returns the response, but the client may crash for some reason and then retry the update SQL.

How should this situation be handled?

I think the query_id could be supplied by the client, with the server generating a random UUID only when it is not.

youngsofun commented 2 years ago

The query_id here is in fact a handle, corresponding to an HTTP request and a query run on one query server.

In this context, it is better for the server to generate it, because the server cannot be sure that client-side IDs will not clash.

Snowflake uses another optional "requestID" field in the JSON for the situation you describe.

https://docs.snowflake.com/en/developer-guide/sql-api/guide.html

We can add it later; it needs to be maintained, and the 2 IDs may deserve better names.

Currently, HTTP query state is maintained in one query server only. A client-side ID and its state would have to be maintained globally, in case the original server crashes and does not start again, and each retry would need to record its result state globally too. It also depends on whether clients connect to query servers directly.

However, the current query_id is not supposed to be used directly by the client; it is only used in tests, so we are free to rename it later.
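To make the Snowflake-style idea concrete, here is a purely illustrative Python sketch (not part of this PR) of a server replaying cached results for a client-supplied request ID, so a retried update runs only once:

```python
# Illustrative server-side idempotency keyed by a client-supplied request_id,
# modeled on Snowflake's optional requestID field. Names are hypothetical.

class QueryServer:
    def __init__(self):
        self._results = {}   # request_id -> cached result
        self.executions = 0  # how many times SQL actually ran

    def execute(self, sql, request_id=None):
        if request_id is not None and request_id in self._results:
            return self._results[request_id]  # retry: replay, don't re-run
        self.executions += 1
        result = f"ran: {sql}"
        if request_id is not None:
            self._results[request_id] = result
        return result

server = QueryServer()
r1 = server.execute("update t set money = money + 1", request_id="req-1")
r2 = server.execute("update t set money = money + 1", request_id="req-1")  # client retry
assert r1 == r2 and server.executions == 1  # the update ran only once
```

As noted above, in a multi-server deployment this result map would have to be maintained globally, which is exactly the complication that makes this a longer-term item.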


youngsofun commented 2 years ago

Both IDs are needed.

In most cases the query is a SELECT without modification, so in the client (a wrapper of the HTTP API), a client ID can be added automatically when the query is an INSERT/DELETE.

flaneur2020 commented 2 years ago

> kill: kill running query
> delete: kill running query and delete it

Is it OK to keep only one delete operation, without kill? (Or keep the kill operation but drop the delete operation?) q.q

youngsofun commented 2 years ago

> How about unifying these two error fields into one error field with a specific Error type? A typed error object might be easier to extend with more error kinds in the future.

@flaneur2020

request_error is for simple situations (the page/query does not exist) and may return 400/404.

I just fixed, in the latest commit, "fail to start query" as a special error.

I'm considering removing request_error and returning 400/404 instead, but I have not decided on the details. Is it OK to merge this PR as it is now and change it later?

query_error is used together with the "query_state" field: if the state is stopped, the client checks whether query_error is set and what it is. It may include more details later.

QueryError can be introduced in the next PR.

youngsofun commented 2 years ago

> Is it OK to keep only one delete operation, without kill? (Or keep the kill operation but drop the delete operation?) q.q

I think it is OK to keep only delete for now.

Done.

flaneur2020 commented 2 years ago

> > How about unifying these two error fields into one error field with a specific Error type? A typed error object might be easier to extend with more error kinds in the future.
>
> request_error is for simple situations (the page/query does not exist) and may return 400/404.
>
> I just fixed, in the latest commit, "fail to start query" as a special error.
>
> I'm considering removing request_error and returning 400/404 instead, but I have not decided on the details.

On getting a 404/400, maybe we could define a unified response message? The response object on a 404 might differ from the QueryResponse, like:

    > GET /blahblahblah
    404 Not Found
    { error: "UrlNotFound", message: "url not found" }

    > POST /blahblah
    400 Bad Request
    { error: "BadArgument", message: "field is missing", info: {"query": "this field is required"} }

> Is it OK to merge this PR as it is now and change it later?

Sure, never mind, this is not critical. A unified error representation might make life easier on the frontend side :)
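For what it's worth, the unified shape proposed above could be produced by a tiny helper like this Python sketch. The error/message/info field names are only the ones suggested in this thread, not anything the PR actually implements:

```python
# Hypothetical builder for the unified error bodies proposed in this thread.

def error_body(status, message, info=None):
    """Map an HTTP status to the suggested {error, message, info} shape."""
    kinds = {404: "UrlNotFound", 400: "BadArgument"}
    body = {"error": kinds.get(status, "Internal"), "message": message}
    if info is not None:
        body["info"] = info  # optional per-field details, e.g. missing params
    return body

assert error_body(404, "url not found") == {
    "error": "UrlNotFound", "message": "url not found"}
assert error_body(400, "field is missing",
                  {"query": "this field is required"})["error"] == "BadArgument"
```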

youngsofun commented 2 years ago

> What will happen if we start an HTTP query and, while it is running, execute a KILL in the MySQL console? q.q

The client will find that the query has been killed when it requests next_uri or state_uri.

youngsofun commented 2 years ago

> On getting a 404/400, maybe we could define a unified response message? The response object on a 404 might differ from the QueryResponse.

This is just the part I'm not sure about.

Do we really need JSON? Maybe an error string in the body is enough for now. JSON is needed if we want to make it easier for client programs to handle errors; an error code may be helpful:

{
  "code": 2333,
  "message": "oh no"
}

But 404 and 400 already carry the meaning of UrlNotFound and BadArgument, which seems enough for now. It is clear enough since the client also knows the URI it requested.

Another reason to use plain text is that we now use the poem Json extractor: when it meets invalid JSON, it just returns a 400 with an error message in the body. Of course, we could extract a String and decode it ourselves, but I wonder whether it is worth it.

youngsofun commented 2 years ago

@ZhiHanZ

Can't find any online and idle self-hosted runner in the current repository, account/organization that matches the required labels: 'stateful-runner'
Waiting for a self-hosted runner to pickup this job...

BohuTANG commented 2 years ago

Please merge with the main branch, the Stateful test fixed in https://github.com/datafuselabs/databend/pull/2717

youngsofun commented 2 years ago

/rerun

youngsofun commented 2 years ago

Tests passed on my Mac, but CI fails:

https://github.com/datafuselabs/databend/runs/4147948472?check_suite_focus=true

---- servers::clickhouse::clickhouse_handler_test::test_clickhouse_insert_to_fuse_table stdout ----
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: Code: 2407, displayText = targeting version 1, current version 2.

   0: common_exception::exception::ErrorCode::TableVersionMissMatch
             at /home/runner/work/databend/databend/common/exception/src/exception.rs:110:77
   1: common_meta_embedded::meta_api_impl::<impl common_meta_api::meta_api::MetaApi for common_meta_embedded::meta_embedded::MetaEmbedded>::upsert_table_option::{{closure}}
             at /home/runner/work/databend/databend/common/meta/embedded/src/meta_api_impl.rs:260:24
BohuTANG commented 2 years ago

> Tests passed on my Mac, but CI fails:
>
> https://github.com/datafuselabs/databend/runs/4147948472?check_suite_focus=true
>
> ---- servers::clickhouse::clickhouse_handler_test::test_clickhouse_insert_to_fuse_table stdout ----
> thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: Code: 2407, displayText = targeting version 1, current version 2.
>
>    0: common_exception::exception::ErrorCode::TableVersionMissMatch
>              at /home/runner/work/databend/databend/common/exception/src/exception.rs:110:77
>    1: common_meta_embedded::meta_api_impl::<impl common_meta_api::meta_api::MetaApi for common_meta_embedded::meta_embedded::MetaEmbedded>::upsert_table_option::{{closure}}
>              at /home/runner/work/databend/databend/common/meta/embedded/src/meta_api_impl.rs:260:24

cc @dantengsky @drmingdrmer

youngsofun commented 2 years ago

insert timeout

https://github.com/datafuselabs/databend/blob/main/query/src/servers/clickhouse/clickhouse_handler_test.rs#L100

Error: Code: 1000, displayText = Error insert table: Driver(Timeout).

   0: common_exception::exception::ErrorCode::UnknownException
             at /home/runner/work/databend/databend/common/exception/src/exception.rs:110:77
   1: databend_query::servers::clickhouse::clickhouse_handler_test::insert::{{closure}}
             at src/servers/clickhouse/clickhouse_handler_test.rs:197:27
   2: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/b69fe57261086e70aea9d5b58819a1794bf7c121/library/core/src/future/mod.rs:80:19
   3: databend_query::servers::clickhouse::clickhouse_handler_test::test_clickhouse_insert_to_fuse_table::{{closure}}
             at src/servers/clickhouse/clickhouse_handler_test.rs:100:5

cc @sundy-li https://github.com/datafuselabs/databend/issues/2460

drmingdrmer commented 2 years ago

> Tests passed on my Mac, but CI fails: https://github.com/datafuselabs/databend/runs/4147948472?check_suite_focus=true
>
> ---- servers::clickhouse::clickhouse_handler_test::test_clickhouse_insert_to_fuse_table stdout ----
> thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: Code: 2407, displayText = targeting version 1, current version 2.
>
>    0: common_exception::exception::ErrorCode::TableVersionMissMatch
>              at /home/runner/work/databend/databend/common/exception/src/exception.rs:110:77
>    1: common_meta_embedded::meta_api_impl::<impl common_meta_api::meta_api::MetaApi for common_meta_embedded::meta_embedded::MetaEmbedded>::upsert_table_option::{{closure}}
>              at /home/runner/work/databend/databend/common/meta/embedded/src/meta_api_impl.rs:260:24
>
> cc @dantengsky @drmingdrmer

This is a known issue that get_table() returns stale caches. @zhang2014 has been working on it: https://github.com/datafuselabs/databend/issues/2589#issuecomment-958784245

sunli829 commented 2 years ago

> Another reason to use plain text is that we now use the poem Json extractor: when it meets invalid JSON, it just returns a 400 with an error message in the body. Of course, we could extract a String and decode it ourselves, but I wonder whether it is worth it.

@youngsofun

You can handle the errors of the JSON extractor yourself, like this:

#[poem::handler]
fn index(req: Result<Json<R>, ParseJsonError>) {
    let req = match req {
        Ok(req) => req,
        Err(err) => {
            // parse json error
        }
    };
}
youngsofun commented 2 years ago

@sunli829 great! thank you

youngsofun commented 2 years ago

@zhang2014 all done, thank you

databend-bot commented 2 years ago

Wait for another reviewer approval

flaneur2020 commented 2 years ago

/lgtm

databend-bot commented 2 years ago

CI Passed Reviewers Approved Let's Merge Thank you for the PR @youngsofun

youngsofun commented 2 years ago

@BohuTANG is it ok to merge?
