Open asfimport opened 4 years ago
Wes McKinney / @wesm: The PLAIN encoding for the boolean type is possibly malformed. I opened PARQUET-1859 about providing better error messages, but here is the failure:
$ python test.py
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    pq.read_table(path)
  File "/home/wesm/code/arrow/python/pyarrow/parquet.py", line 1539, in read_table
    use_pandas_metadata=use_pandas_metadata)
  File "/home/wesm/code/arrow/python/pyarrow/parquet.py", line 1264, in read
    use_pandas_metadata=use_pandas_metadata)
  File "/home/wesm/code/arrow/python/pyarrow/parquet.py", line 707, in read
    table = reader.read(**options)
  File "/home/wesm/code/arrow/python/pyarrow/parquet.py", line 337, in read
    use_threads=use_threads)
  File "pyarrow/_parquet.pyx", line 1130, in pyarrow._parquet.ParquetReader.read_all
    check_status(self.reader.get()
  File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
    raise IOError(message)
OSError: Unexpected end of stream: Failed to decode 1000000 bits for boolean PLAIN encoding only decoded 2048
In ../src/parquet/arrow/reader.cc, line 844, code: final_status
Can this file be read by the Java library?
Novice: Did you mean Rust? :)
I haven't tried; my workflow is to write using Rust and read from Python.
Wes McKinney / @wesm: Yes, it looks like the file written by Rust is malformed. The fact that two independent implementations fail to read it is good evidence of that.
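For context on the "Failed to decode 1000000 bits ... only decoded 2048" message: the Parquet format stores PLAIN-encoded booleans bit-packed, one bit per value, least significant bit first. The following is a minimal sketch of that layout, assuming nothing about the actual Arrow C++ or Rust implementations; the function names `encode_plain_booleans` and `decode_plain_booleans` are hypothetical and exist only for illustration. It shows how a data page that holds fewer bytes than the declared value count leads to this kind of "unexpected end of stream" failure.

```python
def encode_plain_booleans(values):
    """Bit-pack booleans, least significant bit first (PLAIN boolean layout)."""
    out = bytearray((len(values) + 7) // 8)  # one bit per value, rounded up to bytes
    for i, value in enumerate(values):
        if value:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def decode_plain_booleans(buf, num_values):
    """Decode num_values bit-packed booleans; fail if the buffer is too short."""
    available = len(buf) * 8
    if available < num_values:
        # Illustrative error only; not the message produced by any real reader.
        raise EOFError(
            f"Failed to decode {num_values} bits for boolean PLAIN encoding, "
            f"only {available} available"
        )
    return [bool((buf[i // 8] >> (i % 8)) & 1) for i in range(num_values)]
```

Under this layout, 1,000,000 boolean values need 125,000 bytes. A writer that declares 1,000,000 values but emits only 256 bytes of data gives the reader 2048 bits to work with, so `decode_plain_booleans(bytes(256), 1_000_000)` raises `EOFError`. Whether this is the exact cause in the file reported here is an assumption, not something confirmed in this thread.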
ii: This is blocking me pretty hard right now, especially since I can't work around it by setting my boolean columns to use RLE because pyarrow doesn't seem to support that encoding.
Is there anything I can do to help? I've tried dumping the Parquet file generated by my Rust code using parquet-tools cat -j, and it seems to work fine, including all the boolean values.
Here is the error I got:
Pyarrow:
fastparquet:
The corresponding Rust code is:
Reporter: Novice
Note: This issue was originally created as PARQUET-1858. Please see the migration documentation for further details.