confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io

Support for JSON Lines #3324

Open jpelletier-dad opened 5 years ago

jpelletier-dad commented 5 years ago

It would be great if there were support for valid JSON Lines syntax: http://jsonlines.org/examples/

jpelletier-dad commented 5 years ago

We consume data directly from market vendors who send it as JSON Lines (json-l). We'd like to do as little transformation as possible, ideally just splitting each message into one JSON blob per line, since json-l supports structures like

# no outer object
[ 'hi', 'there']

and

# inconsistent list element types
['hi',0,['a',1]]

big-andy-coates commented 5 years ago

Hi @jpelletier-dad,

[ "hi", "there" ]

Something like this is supported with syntax such as:

CREATE STREAM INPUT (foo ARRAY<STRING>) WITH (WRAP_SINGLE_VALUE=false, ...);

This will make the JSON array available in the column named foo.

JSON arrays containing elements of different types, e.g.

["h",0,["a",1]]

are not currently supported. While this is valid JSON, KSQL does not yet have a type that can express it. Effectively, what we'd need is some kind of ARRAY<ANY> type, or perhaps native support for JSON data.
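
In the meantime, anyone preprocessing JSON Lines before it reaches KSQL could do something along these lines. This is only a minimal Python sketch and not part of KSQL: the helper names `split_json_lines` and `fits_string_array` are hypothetical, and it assumes the whole payload is available as a single string. It splits the payload into one JSON value per line and reports which lines would fit a column declared as ARRAY<STRING>:

```python
import json


def split_json_lines(payload: str) -> list:
    """Parse a JSON Lines payload into one Python value per non-empty line.

    Splits on "\n" only, since that is the JSON Lines record separator.
    """
    return [json.loads(line) for line in payload.split("\n") if line.strip()]


def fits_string_array(value) -> bool:
    """Return True if the value is a JSON array whose elements are all strings."""
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


# The two examples from this thread, one JSON value per line.
payload = '["hi", "there"]\n["h",0,["a",1]]\n'
records = split_json_lines(payload)

for record in records:
    # The flag shows whether the line could map to an ARRAY<STRING> column.
    print(record, fits_string_array(record))
```

Lines that fail the check would still need some other handling upstream, since there is no KSQL type to map them to yet.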

Thanks for raising this request!