getindata / flink-http-connector

Http Connector for Apache Flink. Provides sources and sinks for the DataStream, Table, and SQL APIs.
Apache License 2.0

Support for API path parameters #74

Closed AdrianVasiliu closed 3 months ago

AdrianVasiliu commented 4 months ago

Among the different types of API parameters, it appears the path parameters (for instance /customers/{customerId}/orders/{orderId}) are currently not supported. At least it looks so by looking at the code, tests, and documentation.

The URL can be augmented with path segments given as constant literals, statically, at connector configuration time. But in most cases the path parameters need to be fed from table columns.
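As a minimal sketch of that static workaround (host, port, and the IDs `42` and `7` are hypothetical; the option names follow the connector's existing examples), the path segments can only be hard-coded today:

```sql
-- Hypothetical sketch: customer 42 and order 7 are baked into the URL
-- at configuration time, so the table serves exactly one path.
CREATE TABLE SingleOrder (
  Amount INT
) WITH (
  'connector' = 'rest-lookup',
  'format' = 'json',
  'url' = 'http://localhost:8080/client/customers/42/orders/7'
)
```

This illustrates the limitation described above: a new table definition would be needed for every (customerId, orderId) pair.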

Since path parameters are by definition required (not optional), the current limitation makes it impossible to use the connector with many APIs.

A solution would need to allow the coexistence of path parameters with the existing query and body parameters.

davidradl commented 4 months ago

@AdrianVasiliu interesting. How are you wanting the Flink experience to look for the API call /customers/{customerId}/orders/{orderId} ?

It seems to me that the lookup would need to support:

- query and path parameters for GET, and
- query, path, and request body parameters for POST.

The equivalent would be similar to specifying a value for {orderId}, which it cannot do at the moment. What are you thinking about how you might dynamically specify {customerId} as well? Some Flink SQL and the additional customizations would be useful if you have thought about them. Here are some examples I was thinking about. I could see how examples 1 and 2 might be possible, but am wondering how we could do example 3, if customerId is not in the predicates or result set. I assume we would need customerId as a column in an existing table we can reference; maybe it was discovered as the result of a lookup. Is example 3 representative of any of the APIs you need?

Example 1:

```sql
CREATE TABLE Orders (
  orderId STRING,
  Amount INT
) WITH (
  'connector' = 'rest-lookup',
  'format' = 'json',
  'url' = 'http://localhost:8080/client/orders/{orderId}',
  'asyncPolling' = 'true'
)
```

Example 2:

```sql
CREATE TABLE Orders (
  customerId STRING,
  orderId STRING,
  Amount INT
) WITH (
  'connector' = 'rest-lookup',
  'format' = 'json',
  'url' = 'http://localhost:8080/client/customers/{customerId}/orders/{orderId}',
  'asyncPolling' = 'true'
)
```

Example 3:

```sql
CREATE TABLE Orders (
  orderId STRING,
  Amount INT
) WITH (
  'connector' = 'rest-lookup',
  'format' = 'json',
  'url' = 'http://localhost:8080/client/customers/{customerId}/orders/{orderId}',
  'asyncPolling' = 'true'
)
```

When you say coexistence, do you mean in the same query? I suspect a join condition would map to a GET, POST, or path parameter. Do you need multiple join conditions?

@kristoffSC is this something you have considered? If so, did you have any thoughts on designs?

AdrianVasiliu commented 4 months ago

@davidradl

How are you wanting the Flink experience to look for the API call

As in your 3 examples :-)

When you say coexistence - do you mean in the same query. I suspect a join condition would be get, post or path.

Yes, in the same query; that is, the connector should support common use cases such as:

- GET with both path and query parameters
- POST with path + query and/or body parameters
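For instance (a hedged sketch, not the connector's current behavior), a GET lookup could combine a templated path with query parameters derived from the remaining join keys; the table name, columns, and URL below are illustrative:

```sql
-- Hypothetical: {customerId} would be filled from the customerId join key,
-- while a remaining join key such as status would be sent as a query
-- parameter, producing e.g. GET /client/customers/42/orders?status=OPEN
CREATE TABLE CustomerOrders (
  customerId STRING,
  status STRING,
  Amount INT
) WITH (
  'connector' = 'rest-lookup',
  'format' = 'json',
  'url' = 'http://localhost:8080/client/customers/{customerId}/orders',
  'asyncPolling' = 'true'
)
```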

Do you need multiple join conditions?

I'd expect a JOIN with an ON clause such as

```sql
JOIN ... ON `lookupTable`.`customerId` = `inputTable`.`id`
        AND `lookupTable`.`otherParam` = `inputTable`.`otherParam`
```

etc.
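A complete lookup join along these lines might read as follows (a sketch only: the table names, columns, and the `proc_time` attribute are illustrative, using Flink SQL's standard `FOR SYSTEM_TIME AS OF` processing-time lookup join syntax):

```sql
-- Hypothetical: inputTable is a streaming source with a processing-time
-- attribute; lookupTable is the rest-lookup table with path parameters.
SELECT l.customerId, l.otherParam, l.Amount
FROM inputTable AS i
JOIN lookupTable FOR SYSTEM_TIME AS OF i.proc_time AS l
  ON l.customerId = i.id
 AND l.otherParam = i.otherParam
```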

Remarks:

davidradl commented 4 months ago

@AdrianVasiliu thanks for the extra information. I am thinking that in an RDB we would be looking at tables like this:

`customers`: `id` (PK) INTEGER, `name` TEXT, `email` TEXT

`orders`: `id` (PK) INTEGER, `customer_id` INTEGER, `item` TEXT, `price`

Conceptually we would have the same exposed as an API.

In your example we would also need an input table (maybe defined with a Kafka connector) that would correlate columns to orderId and customerId. I guess the join would be something like

```sql
JOIN ... ON `customers`.`customerId` = `inputTable`.`aParam`
        AND `orders`.`id` = `inputTable`.`otherParam`
```
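Putting the pieces together, one possible end-to-end sketch (the Kafka topic, column names, and `PROCTIME()` attribute are assumptions for illustration; the `Orders` table is the one from Example 2 above):

```sql
-- Hypothetical input stream correlating both path parameters;
-- broker properties are omitted for brevity.
CREATE TABLE inputTable (
  aParam STRING,
  otherParam STRING,
  proc_time AS PROCTIME()
) WITH (
  'connector' = 'kafka',
  'topic' = 'order-keys',
  'format' = 'json'
);

-- Lookup join that would fill {customerId} and {orderId} in the URL
-- template from the join keys.
SELECT o.customerId, o.orderId, o.Amount
FROM inputTable AS i
JOIN Orders FOR SYSTEM_TIME AS OF i.proc_time AS o
  ON o.customerId = i.aParam
 AND o.orderId = i.otherParam
```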
davidradl commented 3 months ago

@kristoffSC please could you assign this one to me.