apache / kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
https://kyuubi.apache.org/
Apache License 2.0

[TASK][MEDIUM] Allow returning custom response for GetInfo request #5371

Open pan3793 opened 9 months ago

pan3793 commented 9 months ago

Background and Goals

Kyuubi implements the Hive-compatible Thrift-based API, just as Impala and Spark Thrift Server do, so technically, the Hive JDBC/ODBC clients based on Thrift-API should work smoothly with Apache Kyuubi.

Unfortunately, we found some clients verify the GetInfo results and may reject the connection from an unrecognized Server. See details at https://github.com/apache/kyuubi/issues/3032

After https://github.com/apache/kyuubi/pull/3122, Kyuubi can return either the SERVER's or the ENGINE's information in the GetInfo response, but this is not always sufficient. We may want to make the response configurable so that users can have Kyuubi return any information they want. That would make it easy to impersonate any kind of server, allowing ODBC clients like PowerBI and Tableau to connect.

Implementation steps

Currently, the candidate values of kyuubi.server.info.provider are SERVER and ENGINE. We can introduce a new option, CUSTOM, with additional configuration that lets users set each property of the GetInfo response.
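To illustrate the proposal, here is a minimal Java sketch of how a three-way provider switch might resolve a GetInfo value. It is not taken from the Kyuubi code base: the class, the method, the plain string maps, and the sample values are all hypothetical, and falling back to the server's value when CUSTOM has no override is an assumption rather than something this issue specifies.

```java
import java.util.Map;

/** Hypothetical sketch; none of these names come from the Kyuubi code base. */
final class InfoProviderSketch {

    /** The two existing provider values plus the proposed CUSTOM option. */
    enum Provider { SERVER, ENGINE, CUSTOM }

    /**
     * Resolves the value returned for one GetInfo type name.
     * serverInfo and engineInfo stand in for what is returned today;
     * customInfo holds the user-configured overrides.
     */
    static String resolve(Provider provider, String infoType,
                          Map<String, String> serverInfo,
                          Map<String, String> engineInfo,
                          Map<String, String> customInfo) {
        switch (provider) {
            case ENGINE:
                return engineInfo.get(infoType);
            case CUSTOM:
                // Assumption: use the server's value when no override is configured.
                return customInfo.getOrDefault(infoType, serverInfo.get(infoType));
            case SERVER:
            default:
                return serverInfo.get(infoType);
        }
    }

    public static void main(String[] args) {
        // Placeholder values, chosen only for illustration.
        Map<String, String> server = Map.of("CLI_SERVER_NAME", "server-name");
        Map<String, String> engine = Map.of("CLI_SERVER_NAME", "engine-name");
        Map<String, String> custom = Map.of("CLI_SERVER_NAME", "custom-name");
        System.out.println(
            resolve(Provider.CUSTOM, "CLI_SERVER_NAME", server, engine, custom));
    }
}
```

With this shape, SERVER and ENGINE keep their current behavior and only CUSTOM consults the user-supplied values.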

Additional context

Introduction of https://github.com/apache/kyuubi/issues/6232

BruceWong96 commented 8 months ago

Hi @pan3793, I am very interested in this task. Could you assign it to me? Thanks.

pan3793 commented 8 months ago

@BruceWong96 thanks, do you have an estimated deadline for this task?

BruceWong96 commented 8 months ago

@BruceWong96 thanks, do you have an estimated deadline for this task?

If everything goes well, I will do my best to finish by November 15th.

BruceWong96 commented 7 months ago

Hello @pan3793, sorry for missing the deadline. While writing the code, I realized that I must have misunderstood the work plan. How can the getInfo interface customize what it returns? In what form do we customize key and value? Do we provide an abstract interface for the user to implement? I'm not sure what kind of information the user wants to request. I need some more detailed guidance.

Thank you.

pan3793 commented 7 months ago

do we customize key and value?

I think so. Define a configuration prefix, and the user could then supply a series of key-value pairs.

Do we provide an abstract interface for the user to implement?

Not yet. At least, it is not covered by this task's scope. But I'm open to discussing the details if you think it's necessary or useful in some cases.

I'm not sure what kind of information the user wants to request.

There are several closed-source ODBC driver implementations that are compatible with the Hive Thrift protocol but may not be usable with Kyuubi, because they do not recognize the information returned from GetInfo.
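The configuration-prefix idea in this comment could be sketched as follows. This is only an illustration: the prefix string kyuubi.server.info.custom. is a hypothetical name (the thread does not settle on one), and a plain Map stands in for whatever configuration object the server actually uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of collecting user-supplied GetInfo overrides by prefix. */
final class CustomInfoConfigSketch {

    // Hypothetical prefix, chosen only for illustration; not an existing Kyuubi setting.
    static final String PREFIX = "kyuubi.server.info.custom.";

    /** Returns every entry under PREFIX, keyed by the part of the name after the prefix. */
    static Map<String, String> extract(Map<String, String> conf) {
        Map<String, String> overrides = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            String key = entry.getKey();
            // Skip unrelated settings and a bare prefix with no key after it.
            if (key.startsWith(PREFIX) && key.length() > PREFIX.length()) {
                overrides.put(key.substring(PREFIX.length()), entry.getValue());
            }
        }
        return overrides;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("kyuubi.server.info.provider", "CUSTOM");
        conf.put(PREFIX + "CLI_SERVER_NAME", "custom-name"); // placeholder value
        System.out.println(extract(conf));
    }
}
```

Keeping the overrides under a single prefix means new GetInfo properties can be set without adding one configuration entry per property.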

BruceWong96 commented 7 months ago

There are several closed-source ODBC driver implementations that are compatible with the Hive Thrift protocol but may not be usable with Kyuubi, because they do not recognize the information returned from GetInfo.

[Image: screenshot of the getInfo function]

Regarding the details in the code above: the closed-source ODBC driver sends a request code with its getInfo request, and the parameter type of getInfo is TGetInfoType. Should the client custom key also be TGetInfoType?

Do you have any more ideas about the client's request parameters?

Looking forward to your reply.

pan3793 commented 7 months ago

Should the client custom key also be TGetInfoType?

Yea, the client MUST respect the Hive-defined thrift protocol.
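
One way to enforce this on the configuration side is to accept only keys that name a TGetInfoType member and reject anything else at startup. The sketch below is a hedged illustration, not Kyuubi's implementation: InfoType is a local stand-in listing three members of Hive's Thrift-generated TGetInfoType so that the example compiles without Hive on the classpath, and the class and method names are hypothetical.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch of validating custom GetInfo keys against the protocol's enum. */
final class InfoTypeKeySketch {

    /** Local stand-in for a few members of Hive's TGetInfoType. */
    enum InfoType { CLI_SERVER_NAME, CLI_DBMS_NAME, CLI_DBMS_VER }

    /** Converts raw string keys to enum keys, failing fast on any unknown name. */
    static Map<InfoType, String> parse(Map<String, String> raw) {
        Map<InfoType, String> typed = new EnumMap<>(InfoType.class);
        for (Map.Entry<String, String> entry : raw.entrySet()) {
            try {
                typed.put(InfoType.valueOf(entry.getKey()), entry.getValue());
            } catch (IllegalArgumentException ex) {
                throw new IllegalArgumentException(
                    "Unknown GetInfo type: " + entry.getKey(), ex);
            }
        }
        return typed;
    }

    public static void main(String[] args) {
        // Placeholder value, chosen only for illustration.
        System.out.println(parse(Map.of("CLI_DBMS_NAME", "custom-dbms")));
    }
}
```

Because clients can only ask for types defined by the protocol, a key outside the enum could never be requested, so rejecting it early surfaces typos in the configuration.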