What changes were proposed in this pull request?
This PR fixes the IllegalArgumentException thrown when the benchmark reads JSON timestamp values. Timestamps are now written to and read from JSON as long values instead of strings.
Why are the changes needed?
ORC-1191 switched the taxi data set from CSV to Parquet, so the benchmark now reads timestamps from the Parquet files. Those timestamps are stored in microseconds, whereas Java's java.sql.Timestamp works in milliseconds.
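To illustrate the unit mismatch, here is a minimal sketch of building a java.sql.Timestamp from a microsecond value. The class MicrosToTimestamp and the helper fromMicros are hypothetical names used only for this example; they are not part of the benchmark code and are not the change made in this PR.

```java
import java.sql.Timestamp;

// Hypothetical example class, not part of orc-benchmarks.
class MicrosToTimestamp {
    // Builds a Timestamp from microseconds since the epoch.
    static Timestamp fromMicros(long micros) {
        // floorDiv/floorMod keep the sub-second part non-negative for pre-epoch values.
        long seconds = Math.floorDiv(micros, 1_000_000L);
        int microsOfSecond = (int) Math.floorMod(micros, 1_000_000L);
        Timestamp ts = new Timestamp(seconds * 1000L); // the constructor takes milliseconds
        ts.setNanos(microsOfSecond * 1000);            // restore the sub-second precision
        return ts;
    }

    public static void main(String[] args) {
        long micros = 1446341079000000L;
        // Passing the microsecond value directly treats it as milliseconds.
        System.out.println(new Timestamp(micros).getTime());
        // Converting first gives the intended instant.
        System.out.println(fromMicros(micros).toInstant()); // 2015-11-01T01:24:39Z
    }
}
```

The sample value is the one used in the reproduction in this description: read as microseconds it is a date in 2015, while read as milliseconds it lands tens of thousands of years in the future.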
Writing the data to JSON and then running the scan command produces the following error:
java -jar core/target/orc-benchmarks-core-*-uber.jar scan data -format json
Exception in thread "main" java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
at java.sql/java.sql.Timestamp.valueOf(Timestamp.java:224)
at org.apache.orc.bench.core.convert.json.JsonReader$TimestampColumnConverter.convert(JsonReader.java:175)
at org.apache.orc.bench.core.convert.json.JsonReader.nextBatch(JsonReader.java:86)
at org.apache.orc.bench.core.convert.ScanVariants.run(ScanVariants.java:92)
at org.apache.orc.bench.core.Driver.main(Driver.java:64)
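The cause is that JSON timestamp values are written with java.sql.Timestamp#toString, but reading them back with java.sql.Timestamp#valueOf fails. A microsecond value passed to the millisecond-based constructor yields a five-digit year, which valueOf cannot parse: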
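import java.sql.Timestamp;
// 1446341079000000L is in microseconds, but the constructor expects milliseconds.
Timestamp ts = new Timestamp(1446341079000000L);
System.out.println(ts);
System.out.println(Timestamp.valueOf(ts.toString())); // throws: the five-digit year is rejected
Output:
47802-09-23 02:50:00.0
Exception in thread "main" java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
	at java.sql.Timestamp.valueOf(Timestamp.java:237)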
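The long-based round trip described in this PR can be sketched as follows. This is only an illustration under stated assumptions: the class TimestampAsLong and its encode/decode helpers are hypothetical, and the sketch assumes the long is epoch milliseconds from getTime(), which this description does not specify. It does not show the actual JsonReader or writer code.

```java
import java.sql.Timestamp;

// Hypothetical example class, not part of orc-benchmarks.
class TimestampAsLong {
    // Value that would be written to the JSON field (assumed: epoch milliseconds).
    static long encode(Timestamp ts) {
        return ts.getTime();
    }

    // Rebuilds the Timestamp from the long read back from JSON.
    static Timestamp decode(long millis) {
        return new Timestamp(millis);
    }

    public static void main(String[] args) {
        Timestamp ts = new Timestamp(1446341079000000L);

        // String round trip: toString() emits a five-digit year that valueOf() rejects.
        try {
            Timestamp.valueOf(ts.toString());
            System.out.println("string round trip succeeded");
        } catch (IllegalArgumentException e) {
            System.out.println("string round trip failed: " + e.getMessage());
        }

        // Long round trip: no text parsing is involved, so the year does not matter.
        Timestamp back = decode(encode(ts));
        System.out.println("long round trip equal: " + back.equals(ts));
    }
}
```

One limitation of this sketch is that getTime() carries only millisecond precision, so any sub-millisecond nanos on the original Timestamp would be lost.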
How was this patch tested?
Tested locally.
Was this patch authored or co-authored using generative AI tooling?
No