prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0
16.05k stars 5.38k forks source link

[native] Document v1/operation end point #21749

Open mbasmanova opened 9 months ago

mbasmanova commented 9 months ago

Prestissimo workers support "HTTP GET v1/operation" endpoint to assist in debugging production issues.

Message Format

HTTP GET v1/operation/<target>/<action>?<param1>=<value1>&<param2>=<value2>...&<paramN>=<valueN>

Send message using curl

curl "hostname:7777/v1/operation/target/action?params"

Example:

curl "hostname:7777/v1/operation/systemConfig/setProperty?name=shutdown-onset-sec&value=7"

Supported commands

CC: @spershin @tanjialiang @majetideepak @tdcmeehan

tdcmeehan commented 9 months ago

Quick question on connector/clearCache and connector/getCacheStats: I presume id refers to the catalog ID. If so, it would be nice to eventually simplify these endpoints to something along the lines of catalog/clearCache and remove the name, since it seems to be not needed if one has the catalog id.

Additionally, it would be nice to document how these endpoints are secured (I presume it is consistent with other existing worker endpoints).

tdcmeehan commented 9 months ago

A great place for this documentation would be https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/develop/presto-native.rst

spershin commented 9 months ago

@tdcmeehan I don't get the suggestion: if we replace 'connector/clearCache' by the 'catalog/clearCache', why can we remove the 'name' parameter?

tdcmeehan commented 9 months ago

@spershin if name is the connector name, and id is the catalog id, then it seems you'd be able to infer the connector name from the catalog id. Is that not the case?

spershin commented 9 months ago

@tdcmeehan

I believe we have cases when 1 connector can be used in more than 1 catalog.

tdcmeehan commented 9 months ago

Right--but are there any cases where 1 catalog maps to N connectors? If not, then it seems you could just refer to the catalog.

spershin commented 9 months ago

@tdcmeehan

Probably no cases when 1 catalog maps to N connectors, as there is single property "connector.name" per catalog config. However I'm not 100% sure here, even though I don't know how to make 1 -> N mapping.

I believe the need to specify 'name' and 'id' came from the simple implementation. The cache in the code belongs to the connector, which can have multiple catalogs, not the other way around.

The endpoint can be simplified as you are saying, but, honestly, I don't see much point in it. It is not like it is being regularly or even occasionally used, so this optimization won't gain us much.

tdcmeehan commented 9 months ago

@spershin think of this issue as broadening the usage of these endpoints through public documentation, directing others outside of Meta to use them.

Speaking broadly: we want APIs to be clean and easy to use. If we wait until there is a critical mass of people using these APIs, then it's too late--it's hard to fix it at that point.

I agree this is a small change, but the overall project consists of lots of small features, and small inconveniences eventually add up, and the aspiration of this project is to get small details right when possible.

To be clear, we should document the API as it is, but I hope we can think in the future of how to better encapsulate catalog--it is quite cumbersome to have to specify both, and it doesn't seem relevant (thinking forward in a future where you can have a multitude of non-Hive connectors, which I understand is not possible in Prestissimo right now).

spershin commented 9 months ago

@tdcmeehan We can potentially create an issue for this and get it tackled by someone at some point.