apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0
5.42k stars 1.27k forks source link

Ingest and query time in RFC3339 #7276

Open mbecker opened 3 years ago

mbecker commented 3 years ago

Hi,

I'm having a time format in the Kafka Messages as follows:

"_time": "2021-08-10T07:08:08.873Z",

I'm using the following schema for the date:

"dateTimeFieldSpecs": [
      {
        "name": "_time",
        "dataType": "STRING",
        "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",
        "granularity": "1:MINUTES"
      }
    ],

(I'm using a Java format here)

How can I check that the ingest in Pinot is working with the specified date time field?

Then in a Presto SQL the DATETIMECONVERT doesn't work as expected and returns an error:

SELECT DATETIMECONVERT("_time", '1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-mm-ddTHH:MM:SS.%fZ', '1:DAYS:SIMPLE_DATE_FORMAT:yyyyMMdd', '1:DAYS'),
       max(value_number) AS max_1
FROM abc.xyz
LIMIT 1000;

(Here I'm using a python format? as described in superset)

The returned error is as follows:

pinot error: must be real number, not dict

How can I use the column "_time" with the RFC 3339 format as the date time field in Presto / Superset?

Thanks for your help and cheers!

xiangfu0 commented 3 years ago

@Jackie-Jiang @npawar shall we consider providing built-in formats to avoid those date format conversion difficulties?

Jackie-Jiang commented 3 years ago

The TIMESTAMP data type always use yyyy-MM-dd HH:mm:ss.SSS, but this won't avoid the time format conversion if we need the result to be in another format. @mbecker From the Presto SQL query, can you try setting the input format (second argument) to match the field spec format?

mbecker commented 3 years ago

Hi,

thanks for your help!

Sorry, in my original post I mixed some tools like using for example "Presto" (vs. only "Superset").

So, my original use case is that the Kafka Message has this attribute "time" with the RFC 3339 format. I'm ingesting it as a string type and creating the "Pinot" table as described above. In "Superset" I'm defining the dataset and setting the attribute "time" as a time-field with the DATETIME format as described above and with the new format yyyy-MM-dd HH:mm:ss.SSS as commented by @Jackie-Jiang .

Screenshot 2021-08-15 at 09 26 15

Because this is not the repo for "Superset", I'm just asking how to check that the ingestion and the table-format with a STRING-field for the dateTimeFieldSpecs in Pinot is correct :-)

Of course, any other help would be great too ;-)

Thank you very much.

mayankshriv commented 3 years ago

@mbecker If you are just trying to validate if the event is ingested in Pinot correctly and the time format is store correctly in Pinot, then you can simply run a pinot query to do so:

select _time from <tableName>