apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0
5.5k stars 1.29k forks source link

Java client for Controller/Broker/Server REST APIs #8246

Open yupeng9 opened 2 years ago

yupeng9 commented 2 years ago

It will be very helpful to have a library for the common REST APIs for Controller/Broker/Server. There are a number of existing methods/helpers in the code repo, such as in the test utils or connector for this purpose. But those are adhoc and not comprehensive.

Jackie-Jiang commented 2 years ago

+1

ControllerRequestURLBuilder contains most of the APIs. We can probably wrap it into a library

mcvsubbu commented 2 years ago

Agreed that this will be a useful thing.

We will just have to pay more attention to a few things then:

As long as we are wiling to bear the cost, I am all for this feature.

walterddr commented 2 years ago

+1 for this proposal. I am also working on factoring out the helper functions from ControllerTestUtils.

Here is what I proposed to do, please kindly let me know if this is the right track:

  1. summarized all the helper/utils we currently have. (in addition to the ones discussed above, many of the *RestResource in controller/broker have internal static functions to do send request and parse responses)
  2. proposal to add different level of abstractions.
    • HTTPUtils in pinot-common (for constructing pinot get/post/put/delete to RestAPI and parsing response objects)
    • ControllerUtils that utilizes the HTTPUtils to make specific requests such as getSchema, getTableConfig, etc.
    • JavaClient that wraps around the ControllerUtils and is cluster-context aware (e.g. able to configure controller URL during start up and keeps tracking the active controller)
  3. tests to ensure that they are RestAPI compatible; and ensure the client API and controller API are in sync.

Any suggestions? I will create an example PR to demonstrate what I meant soon

walterddr commented 2 years ago

see: #8293 for demonstration of the refactoring

[update] #8329 is the new POC, for several reasons

walterddr commented 2 years ago

8329 seems to work pretty well as a POC: here is my adjusted plan based on the POC

Problem

Currently there are many ad-hoc usage/utils that makes http call to Pinot components, to name a few

There are discrepancies regarding (1) what's the format of the send data / return; (2) what type of return code throws exception or encapsulate the exception in the response code; (3) what default header is being used.

What's more, there are inefficiency in HTTPClient usage: in general HTTPClient should be used as a singleton and leverage the underlying resource reused benefit.

Proposal

Propose to

  1. add HTTPClient in pinot-common that provides generic get/post/put/delete method to RestAPI and parsing response objects/status code.
    • for advance usage such as file upload download, also provide the API to return closable http response stream.
  2. ControllerRequestClient that utilizes the HTTPClient singleton to make specific requests such as getSchema, getTableConfig, etc.
    • later on we can extend to Broker/Server APIs.
  3. Replace any http request utils with the client construct (with a static reference to the singelton HTTPClient)
walterddr commented 2 years ago

update on status: with #8329 and #8390 contributed. item 1 and 3 is almost finished. we can now: add functionalities to ControllerRequestClient and create Server/BrokerRequestClients.

there are other places where either java.net.* or apache.http.* is directly used, I propose to create a separate ticket to track the migration of these usages as they are (1) many and scattered around the codebase and might not make sense to migrate all; (2) mostly will be migrated over when security/auth is enabled for example QueryRunner now doesn't support auth token.