This PR aims to fix the IllegalArgumentException thrown when reading JSON timestamp values in the benchmark.
Why are the changes needed?
ORC-1191 switched the taxi data set from CSV to Parquet. The benchmark now reads timestamps from the Parquet files, where they are stored in microseconds, which differs from the millisecond values that Java's java.sql.Timestamp expects.
When we write the data to JSON and then run the scan command, we get the following error:
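The unit mismatch can be illustrated with a minimal sketch. It assumes the Parquet value arrives as a long holding microseconds since the epoch; the class name MicrosToTimestamp and the helper fromMicros are hypothetical and are not taken from the ORC benchmark code.

```java
import java.sql.Timestamp;

public class MicrosToTimestamp {
    // Hypothetical helper: converts microseconds since the epoch into a
    // java.sql.Timestamp while keeping the microsecond precision.
    static Timestamp fromMicros(long micros) {
        // floorDiv/floorMod keep the sub-second part non-negative for pre-1970 values.
        long seconds = Math.floorDiv(micros, 1_000_000L);
        long microOfSecond = Math.floorMod(micros, 1_000_000L);
        // The constructor takes milliseconds since the epoch.
        Timestamp ts = new Timestamp(seconds * 1000L);
        // The fractional second is stored separately, in nanoseconds.
        ts.setNanos((int) (microOfSecond * 1000L));
        return ts;
    }

    public static void main(String[] args) {
        long micros = 1_600_000_000_123_456L;
        // Passing microseconds straight to the constructor treats them as
        // milliseconds, so the result lands far in the future.
        Timestamp misread = new Timestamp(micros);
        Timestamp converted = fromMicros(micros);
        System.out.println("misread as millis: " + misread);
        System.out.println("converted:         " + converted);
    }
}
```

The printed values depend on the JVM's default time zone, so no expected output is shown here.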
java -jar core/target/orc-benchmarks-core-*-uber.jar scan data -format json
Exception in thread "main" java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
at java.sql/java.sql.Timestamp.valueOf(Timestamp.java:224)
at org.apache.orc.bench.core.convert.json.JsonReader$TimestampColumnConverter.convert(JsonReader.java:175)
at org.apache.orc.bench.core.convert.json.JsonReader.nextBatch(JsonReader.java:86)
at org.apache.orc.bench.core.convert.ScanVariants.run(ScanVariants.java:92)
at org.apache.orc.bench.core.Driver.main(Driver.java:64)
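The exception comes from java.sql.Timestamp.valueOf, which accepts only the yyyy-mm-dd hh:mm:ss[.fffffffff] form. The sketch below reproduces that behavior in isolation. The sample strings are assumptions chosen for illustration; the exact text the benchmark writes into the JSON file is not shown in this description, and the class TimestampParseDemo is hypothetical.

```java
import java.sql.Timestamp;

public class TimestampParseDemo {
    // Returns true if Timestamp.valueOf accepts the text, false if it
    // rejects it with IllegalArgumentException.
    static boolean parses(String text) {
        try {
            Timestamp.valueOf(text);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A value in the JDBC escape format is accepted.
        System.out.println(parses("2015-01-01 00:00:00.123456"));
        // A raw epoch number (assumed example) is rejected.
        System.out.println(parses("1420070400123456"));
    }
}
```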
If we use orc-tools to dump the generated ORC file metadata, the timestamp data is also incorrect.
If we use parquet-cli to dump the metadata of the generated Parquet file, we see the same problem.
https://github.com/apache/orc/blob/952b4792f20eaf1bb63c0eb7319e03b9c3d7a3f1/java/bench/core/src/java/org/apache/orc/bench/core/convert/avro/AvroSchemaUtils.java#L92-L95
How was this patch tested?
Tested locally.
Was this patch authored or co-authored using generative AI tooling?
No