apache / parquet-java

Apache Parquet Java
https://parquet.apache.org/
Apache License 2.0

parquet-pig should not throw an exception reading a timestamp from impala file #1741

Open asfimport opened 9 years ago

asfimport commented 9 years ago

Hi, I need to load an Impala table with timestamp columns using Pig. I think loading timestamp columns is still an NYI (not yet implemented) feature. Until #218 is done, parquet-pig should not throw an exception; it should at least bind INT96 to a bytearray (users can then write their own UDF to convert it to the Pig datetime type). Is anyone working on that?
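For illustration, the conversion such a UDF would need can be sketched in plain Java. This is a minimal sketch, not code from the thread: it assumes the common Impala/Hive INT96 layout (8 bytes of nanoseconds within the day followed by a 4-byte Julian day number, both little-endian) and treats the value as UTC. The class and method names are hypothetical, and the Pig UDF wrapper itself is left out.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

// Hypothetical helper: decodes a 12-byte INT96 timestamp into an Instant.
final class Int96Timestamps {
    // Julian day number of the Unix epoch (1970-01-01).
    private static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;
    private static final long SECONDS_PER_DAY = 86_400L;

    static Instant toInstant(byte[] int96) {
        if (int96 == null || int96.length != 12) {
            throw new IllegalArgumentException("expected exactly 12 bytes");
        }
        // Assumed layout: nanos-of-day (8 bytes), then Julian day (4 bytes), little-endian.
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong();
        long julianDay = buf.getInt();
        long epochSeconds = (julianDay - JULIAN_DAY_OF_EPOCH) * SECONDS_PER_DAY;
        // The value is interpreted as UTC here; the writer may have used local time.
        return Instant.ofEpochSecond(epochSeconds, nanosOfDay);
    }

    public static void main(String[] args) {
        // Julian day 2440589 (1970-01-02) plus one hour of nanoseconds.
        byte[] sample = ByteBuffer.allocate(12)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(3_600_000_000_000L)
                .putInt(2_440_589)
                .array();
        System.out.println(toInstant(sample)); // prints 1970-01-02T01:00:00Z
    }
}
```

A real UDF would take the bytes from the Pig bytearray value and wrap the result in a Pig datetime; whether a time zone adjustment is needed depends on how the file was written.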

Reporter: gpolaert

Note: This issue was originally created as PARQUET-195. Please see the migration documentation for further details.

asfimport commented 9 years ago

Ryan Blue / @rdblue: Could you post the exception you're getting from Pig when you try to read a timestamp? We've added support for the underlying data type (int96 as you noted) to show up as a byte array, so I think this might be a bug rather than a NYI feature.

asfimport commented 9 years ago

gpolaert: I've just patched the 1.5.0 branch for my own purposes; the same problem exists on the master and 1.6.0rc4 branches. I don't have my code with me right now, but the exception comes from the PigSchemaConverter class (line 238).

The patch simply returns DataType.BYTEARRAY instead of throwing an exception.
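The shape of that change can be sketched as follows. This is only an illustration under stated assumptions: the enums and the `toPigType` method are hypothetical stand-ins, not the actual PigSchemaConverter code or the Parquet and Pig type classes, and the other mappings are simplified.

```java
// Hypothetical stand-in for the type mapping described above.
final class PigTypeMappingSketch {
    // Simplified placeholders for the Parquet primitive types and Pig data types.
    enum ParquetPrimitive { BOOLEAN, INT32, INT64, INT96, FLOAT, DOUBLE, BINARY }
    enum PigType { BOOLEAN, INTEGER, LONG, FLOAT, DOUBLE, BYTEARRAY }

    static PigType toPigType(ParquetPrimitive type) {
        switch (type) {
            case BOOLEAN: return PigType.BOOLEAN;
            case INT32:   return PigType.INTEGER;
            case INT64:   return PigType.LONG;
            case FLOAT:   return PigType.FLOAT;
            case DOUBLE:  return PigType.DOUBLE;
            case BINARY:  return PigType.BYTEARRAY;
            // Before the patch this case threw an exception; the patch
            // exposes the raw 12 bytes as a bytearray instead.
            case INT96:   return PigType.BYTEARRAY;
            default:
                throw new IllegalArgumentException("unsupported type: " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(toPigType(ParquetPrimitive.INT96)); // prints BYTEARRAY
    }
}
```

With a mapping like this, the timestamp column loads as raw bytes and the interpretation is left to a user-supplied UDF.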

asfimport commented 9 years ago

Ryan Blue / @rdblue: Okay, so the int96 support is only in the common libraries and Avro, but not Pig. Would you mind posting your fix as a pull request so we can fix master? Thanks!

asfimport commented 6 years ago

Cesar Delgado / @beettlle: @gpolaert, it looks like a change similar to what you've described is already in Parquet master here. Does that change work for you?