confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io

Support serdes in Java client for converting response rows to POJOs #7319

Open vcrfxia opened 3 years ago

vcrfxia commented 3 years ago

Today, the Java client returns response rows (from pull/push queries) as Row/KsqlObject objects. Some users have expressed interest in being able to convert these response objects into POJOs based on the column names/fields in the response objects themselves. Doing so today requires converting the response object into a JSON string (row.asObject().toJsonString()) and then using Jackson (or similar) to deserialize into the desired POJO. This could be avoided if the Java client supported serdes (similar to, e.g., Kafka Streams) to directly deserialize response objects into the desired POJOs.

colinhicks commented 3 years ago

I think it should currently be possible to do something like:

MyObject myObj = KsqlObject.toJsonObject(row.asObject()).mapTo(MyObject.class);

We should confirm, but it looks like this reuses the underlying map holding row data and avoids reserialization. Because JsonObject itself uses Jackson, the target class (e.g. MyObject) can also leverage annotations like @JsonProperty.

We could consider adding a mapTo method to Row to make the above operation more convenient.
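To make the idea concrete, here is a rough sketch of what a mapTo-style convenience could look like. It does not use the ksqlDB client or Jackson: SimpleRow, Order, and the reflection-based copy are hypothetical stand-ins so the example runs with the JDK alone, and it assumes field names match column names exactly. A real implementation would more likely delegate to the JsonObject mapping shown above.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

class RowMapToSketch {

    // Hypothetical stand-in for the client's Row: column names mapped to values.
    static final class SimpleRow {
        private final Map<String, Object> columns;

        SimpleRow(Map<String, Object> columns) {
            this.columns = new LinkedHashMap<>(columns);
        }

        // Creates a target instance and copies each column into the field of the same name.
        // Any reflection failure, such as a column with no matching field, is rethrown
        // as an IllegalArgumentException.
        <T> T mapTo(Class<T> target) {
            try {
                Constructor<T> constructor = target.getDeclaredConstructor();
                constructor.setAccessible(true);
                T instance = constructor.newInstance();
                for (Map.Entry<String, Object> column : columns.entrySet()) {
                    Field field = target.getDeclaredField(column.getKey());
                    field.setAccessible(true);
                    field.set(instance, column.getValue());
                }
                return instance;
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot map row to " + target.getName(), e);
            }
        }
    }

    // Example application class (an assumption for illustration only).
    static final class Order {
        String orderId;
        Integer quantity;
    }

    public static void main(String[] args) {
        Map<String, Object> columns = new LinkedHashMap<>();
        columns.put("orderId", "o-1");
        columns.put("quantity", 3);

        Order order = new SimpleRow(columns).mapTo(Order.class);
        System.out.println(order.orderId + " x " + order.quantity);
    }
}
```

The point of the sketch is the call shape, row.mapTo(Order.class), not the mapping mechanism; annotations such as @JsonProperty would only apply if the mapping went through Jackson.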

I might not be understanding the suggestion to use a custom serde approach. It seems like this doesn't make as much sense here because the wire format is JSON-based.

cdadia commented 3 years ago

@colinhicks What you suggested works, but keeping the design in line with the other Kafka APIs, where serdes are passed in through configuration, would make it easy for developers to follow the same patterns. Just a thought.

I have to imagine the JSON wire format is not going to be convenient for most use cases unless all you are trying to do is display the results. For the most part you will end up serializing that back into something else.

One way to think about this is ORM designs that return rows from database queries.
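As an illustration of the serde-style pattern described here, the following is a minimal sketch only. RowDeserializer, OrderDeserializer, deserializeAll, and the ORDER_ID/QUANTITY column names are all hypothetical; none of them exist in the ksqlDB Java client. Rows are modeled as plain maps so the example runs with the JDK alone.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowDeserializerSketch {

    // Hypothetical serde-style contract: converts one response row into a T.
    interface RowDeserializer<T> {
        T deserialize(Map<String, Object> row);
    }

    // Example application class (an assumption for illustration only).
    static final class Order {
        final String orderId;
        final int quantity;

        Order(String orderId, int quantity) {
            this.orderId = orderId;
            this.quantity = quantity;
        }
    }

    // Hand-written deserializer for Order; column names are assumed.
    static final class OrderDeserializer implements RowDeserializer<Order> {
        @Override
        public Order deserialize(Map<String, Object> row) {
            String orderId = (String) row.get("ORDER_ID");
            // Numeric columns may arrive as Integer or Long, so go through Number.
            int quantity = ((Number) row.get("QUANTITY")).intValue();
            return new Order(orderId, quantity);
        }
    }

    // Hypothetical stand-in for a client call that accepts a deserializer,
    // in the way Kafka APIs accept serdes.
    static <T> List<T> deserializeAll(List<Map<String, Object>> rows,
                                      RowDeserializer<T> deserializer) {
        List<T> results = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            results.add(deserializer.deserialize(row));
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ORDER_ID", "o-1");
        row.put("QUANTITY", 3);

        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row);

        for (Order order : deserializeAll(rows, new OrderDeserializer())) {
            System.out.println(order.orderId + " x " + order.quantity);
        }
    }
}
```

With this shape the application only ever sees Order instances, much as an ORM hands back mapped objects instead of raw rows.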

colinhicks commented 3 years ago

Hi @cdadia, good to hear from you and thank you for the feedback! Let me try to provide a couple details as we continue the conversation.

We chose a JSON-based format for ksqlDB’s HTTP/2 API for a few reasons centered on ease-of-use and adaptability. It’s relatively simple to write new API clients. It’s also convenient to print results, as you mentioned, via tools like cURL. If you’re interested, KLIP-15 and the design proposal discussion go into more detail.

For users of the Java ksqlDB API client, who in turn use the HTTP/2 API, we certainly want to make it straightforward and efficient to use custom POJOs to represent response data. Providing a convenience method for the logic I showed above will hopefully make it clear that mapping response rows into the client application’s data model is an intended, natural usage. I think we can shore up the documentation as well.

We can also assess other wire formats in the future, in addition to the current JSON-based format. Tradeoffs likely include greater configuration complexity for client and server compared to the current approach.

If you are interested, I am happy to create a GitHub issue to kick off assessment of other wire formats. How does that sound?