apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[Java] Map cannot be written to a Parquet file using DatasetFileWriter #38250

Open jduo opened 1 year ago

jduo commented 1 year ago

Describe the bug, including details regarding any error messages, version, and platform.

Trying to write a map column to a parquet file using DatasetFileWriter results in the following error:

Runtime java.lang.IllegalArgumentException: not all nodes and buffers were consumed.
nodes: [ArrowFieldNode [length=2, nullCount=1], ArrowFieldNode [length=0, nullCount=0], ArrowFieldNode [length=0, nullCount=0]]
buffers: [ArrowBuf[509], address:140461672595576, capacity:0, ArrowBuf[510], address:140461672595576, capacity:1, ArrowBuf[511], address:140461672595584, capacity:8, ArrowBuf[512], address:140461672595592, capacity:0, ArrowBuf[513], address:140461672595592, capacity:0, ArrowBuf[514], address:140461672595592, capacity:0]
    at org.apache.arrow.vector.VectorLoader.load(VectorLoader.java:89)
    at org.apache.arrow.vector.ipc.ArrowReader.loadRecordBatch(ArrowReader.java:220)
    at org.apache.arrow.vector.ipc.ArrowStreamReader.loadNextBatch(ArrowStreamReader.java:161)
    at org.apache.arrow.c.ArrayStreamExporter$ExportedArrayStreamPrivateData.getNext(ArrayStreamExporter.java:72)

This happens when the data to be written comes from an ArrowFileReader or ArrowStreamReader. The problem may affect file formats other than Parquet, and might be related to DatasetFileWriter in general rather than to the Parquet writer specifically.
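
For context on the message itself: the check that raises it sits at the end of VectorLoader.load, which walks the schema's fields depth first, takes one field node plus that field's buffers from the record batch at each step, and fails if anything is left over afterwards. The sketch below is a simplified, standard-library-only model of that bookkeeping, not Arrow's code. The names `LoaderSketch` and `Field`, the strings standing in for nodes and buffers, and the childless-map mismatch are all assumptions made up for illustration; the mismatch is contrived and is not a diagnosis of this bug.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Simplified, hypothetical model of a record-batch loader; not Arrow's actual code.
class LoaderSketch {

    // Stand-in for a schema field: how many buffers it owns, plus its child fields.
    static final class Field {
        final String name;
        final int bufferCount;
        final List<Field> children;

        Field(String name, int bufferCount, Field... children) {
            this.name = name;
            this.bufferCount = bufferCount;
            this.children = Arrays.asList(children);
        }
    }

    // Depth-first walk: a field takes one node and its own buffers, then its children do the same.
    static void loadField(Field field, Iterator<String> nodes, Iterator<String> buffers) {
        nodes.next();
        for (int i = 0; i < field.bufferCount; i++) {
            buffers.next();
        }
        for (Field child : field.children) {
            loadField(child, nodes, buffers);
        }
    }

    // Loads a flattened batch against a schema and rejects any leftover nodes or buffers.
    static void load(List<Field> schema, List<String> nodes, List<String> buffers) {
        Iterator<String> nodeIt = nodes.iterator();
        Iterator<String> bufferIt = buffers.iterator();
        for (Field field : schema) {
            loadField(field, nodeIt, bufferIt);
        }
        if (nodeIt.hasNext() || bufferIt.hasNext()) {
            throw new IllegalArgumentException("not all nodes and buffers were consumed. nodes: "
                + remaining(nodeIt) + " buffers: " + remaining(bufferIt));
        }
    }

    // Drains an iterator into a list so the leftovers can be printed.
    static List<String> remaining(Iterator<String> it) {
        List<String> rest = new ArrayList<>();
        while (it.hasNext()) {
            rest.add(it.next());
        }
        return rest;
    }

    public static void main(String[] args) {
        // Flattened batch for a map<int, int> column: map -> entries struct -> key, value.
        List<String> nodes = Arrays.asList("map", "entries", "key", "value");
        List<String> buffers = Arrays.asList("map.validity", "map.offsets", "entries.validity",
            "key.validity", "key.data", "value.validity", "value.data");

        // Schema that matches the batch: everything is consumed, no exception.
        Field full = new Field("map", 2,
            new Field("entries", 1, new Field("key", 2), new Field("value", 2)));
        load(Collections.singletonList(full), nodes, buffers);

        // Contrived mismatch: the loading side's schema lacks the map's children.
        Field childless = new Field("map", 2);
        try {
            load(Collections.singletonList(childless), nodes, buffers);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the second call leaves three nodes and five buffers unconsumed, so any disagreement between the schema used for loading and the layout of the incoming batch surfaces as this exception rather than as a more specific schema error.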

Component(s)

Java, Parquet

vibhatha commented 1 year ago

@jduo is there a script that we could use to reproduce this error?

jduo commented 1 year ago

I've attached a test case that shows this problem. https://github.com/jduo/arrow/blob/master/java/dataset/src/test/java/org/apache/arrow/dataset/file/TestDatasetWriterMap.java

vibhatha commented 1 year ago

@jduo I can reproduce this error. I will look into this.

vibhatha commented 1 year ago

take

jduo commented 1 year ago

Thanks @vibhatha. Just wondering, do you still need the example code that's been posted?

vibhatha commented 1 year ago

@jduo all good, I created a branch and am testing further. Would you mind if I use this code as the base for testing and further investigation?

jduo commented 1 year ago

Yes, you are welcome to use the example code as you see fit.

vibhatha commented 1 year ago

Thank you @jduo

c3-qichen commented 9 months ago

Any update on this issue? I ran into the same error recently. Thanks.