apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Documentation issues for JSON columns #8586

Closed diogobaeder closed 2 years ago

diogobaeder commented 2 years ago

Hi folks,

There are a few issues in the documentation for handling JSON columns.

In the documentation for handling JSON columns with JSON indexing, the example table config is missing the top-level ingestionConfig section. transformConfigs should be placed inside ingestionConfig, not directly at the top level of the table config. That configuration snippet in the docs could also be formatted with more consistent indentation.
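
To illustrate the nesting described above, here is a minimal sketch of a table config fragment. The table name myTable is a placeholder, and the column names group and group_json are only illustrative (modelled on the meetupRsvp example), so treat the details as assumptions rather than the exact snippet the docs should contain:

```json
{
  "tableName": "myTable",
  "tableType": "OFFLINE",
  "ingestionConfig": {
    "transformConfigs": [
      {
        "columnName": "group_json",
        "transformFunction": "jsonFormat(\"group\")"
      }
    ]
  },
  "tableIndexConfig": {
    "jsonIndexColumns": ["group_json"]
  }
}
```

The point is only the placement: transformConfigs sits under ingestionConfig, while jsonIndexColumns sits under tableIndexConfig.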

I also think the documentation should use the JSON column type rather than STRING. Even though both might work, JSON is more consistent, and it is the type actually used in the file that contains the "full spec": https://github.com/apache/pinot/blob/master/pinot-tools/src/main/resources/examples/stream/meetupRsvp/json_meetupRsvp_schema.json

One more thing worth mentioning in the documentation is that the name of the column in the source data used for ingestion has to be different from the destination column name. For example, if I want a column foo in Pinot as JSON, then my source should use a different name, such as foo_source. Ideally, though, it would be awesome if Pinot could take care of such fields itself, without us having to use different names and define a transformation function. Pinot could apply jsonFormat by default to fields that use the JSON type, just to reduce configuration boilerplate.
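
As a rough sketch of the foo / foo_source case, assuming a placeholder schema name myTable and assuming that jsonFormat accepts the bare source column name as its argument, the schema would declare the destination column with the JSON type:

```json
{
  "schemaName": "myTable",
  "dimensionFieldSpecs": [
    {
      "name": "foo",
      "dataType": "JSON"
    }
  ]
}
```

and the table config would map the differently named source field onto it:

```json
{
  "ingestionConfig": {
    "transformConfigs": [
      {
        "columnName": "foo",
        "transformFunction": "jsonFormat(foo_source)"
      }
    ]
  }
}
```

Here foo_source exists only in the incoming records, and foo is the column stored in Pinot.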

Thanks! Diogo

Jackie-Jiang commented 2 years ago

@diogobaeder Thanks for pointing out the issues. Would you like to help us improve the documentation for JSON? You may find the source code under this repo: https://github.com/pinot-contrib/pinot-docs

diogobaeder commented 2 years ago

@Jackie-Jiang sure! I'll fork it and send a PR, thanks :-)

mneedham commented 2 years ago

@diogobaeder I have applied your feedback to the docs - https://docs.pinot.apache.org/basics/data-import/complex-type

diogobaeder commented 2 years ago

Awesome, thanks @mneedham ! :-)