Open ryanruaneyougov opened 2 years ago
The java.lang.UnsupportedOperationException: Unsupported data type : BYTES seems to be due to a missing case for BYTES in the dictionary creator. @Jackie-Jiang SegmentDictionaryCreator.indexOfMV doesn't handle the BYTES case. Is that by design?
The other errors are due to casting in the PreIndexStatsCollector classes, e.g. long value = (long) entry; which throws an error when entry is of type Timestamp. This definitely seems like a bug to me. Will work on the fix.
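For anyone following along, here is a minimal standalone sketch of why that cast fails. It is not Pinot code: the class TimestampCastSketch and both method names are made up for illustration, and the conversion assumes TIMESTAMP values are meant to be stored as epoch milliseconds.

```java
import java.sql.Timestamp;

// Hypothetical illustration only; not the actual PreIndexStatsCollector code.
final class TimestampCastSketch {

  // Same shape as the failing line: this compiles, but at runtime it is a
  // checked cast to Long followed by unboxing, so any non-Long entry
  // (including java.sql.Timestamp) throws ClassCastException.
  static long unsafeToLong(Object entry) {
    return (long) entry;
  }

  // One possible defensive conversion (an assumption, not the committed fix):
  // turn a Timestamp into epoch millis and accept any other Number as-is.
  static long toLong(Object entry) {
    if (entry instanceof Timestamp) {
      return ((Timestamp) entry).getTime();
    }
    if (entry instanceof Number) {
      return ((Number) entry).longValue();
    }
    throw new IllegalArgumentException(
        "Cannot convert to long: " + entry.getClass().getName());
  }
}
```

The actual fix may well convert the value earlier in the ingestion path instead, so treat this only as a way to reproduce the symptom in isolation.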
Initially we didn't support BOOLEAN, TIMESTAMP, or BYTES as MV; that support was added only recently. Some paths might have been missed, and we should fix them.
cc @richardstartin
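To make the "missed path" in the dictionary creator concrete, here is a rough sketch of a multi-value lookup that switches on the stored type. Everything in it is hypothetical (MvDictionarySketch, StoredType, and the map-based dictionary are not Pinot's implementation); it only shows how a type without its own case falls through to the "Unsupported data type" exception, and what a BYTES branch could look like.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a dictionary creator; not Pinot's actual class.
final class MvDictionarySketch {
  enum StoredType { INT, STRING, BYTES }

  private final StoredType type;
  private final Map<Object, Integer> valueToId = new HashMap<>();

  MvDictionarySketch(StoredType type, Object[] uniqueValues) {
    this.type = type;
    // Assign dictionary ids in the order the unique values are given.
    for (int i = 0; i < uniqueValues.length; i++) {
      valueToId.put(key(uniqueValues[i]), i);
    }
  }

  // Normalizes a raw value into a map key. Removing the BYTES case would send
  // byte[] values to the default branch, which is the failure mode described.
  private Object key(Object value) {
    switch (type) {
      case INT:
      case STRING:
        return value;
      case BYTES:
        // byte[] has identity-based equals/hashCode, so wrap it in a
        // ByteBuffer, which compares by content.
        return ByteBuffer.wrap((byte[]) value);
      default:
        throw new UnsupportedOperationException("Unsupported data type : " + type);
    }
  }

  // Maps every element of a multi-value entry to its dictionary id.
  // Values absent from the dictionary are not handled in this sketch.
  int[] indexOfMV(Object[] values) {
    int[] ids = new int[values.length];
    for (int i = 0; i < values.length; i++) {
      ids[i] = valueToId.get(key(values[i]));
    }
    return ids;
  }
}
```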
I have found that, using JSON, I can ingest all types as multi-valued dimension columns with the exception of BOOLEAN, TIMESTAMP, and BYTES. I believe that JSON_ARRAY isn't a valid type, but I wasn't sure about BYTES_ARRAY. If the multi-valued versions of those types are removed from the schema and data files below, ingestion succeeds and I can inspect the table in the Pinot browser. If anyone is about and can shed some light, I would be very appreciative.
Json Ingestion
For BYTES_ARRAY I get:
For BOOLEAN_ARRAY I get:
For TIMESTAMP_ARRAY I get:
Here is my cluster:
Here is my schema:
Here is my table:
Here is my ingestion job:
Here is my data:
CSV Ingestion
Using the following ingestion job, schema, and csv file, an INT was ingested to an INT multi-value column:
Ingestion job:
Schema:
CSV:
However, the aforementioned error from json ingestion presents when tried for booleans:
Error:
Schema:
CSV:
and