Open asfimport opened 3 years ago
Weston Pace / @westonpace: Thank you for creating such a detailed test case. I have run your test against pyarrow 2.0.0 and I can confirm I get the same results that you do. Luckily, when I ran your test against the latest code I did not see this error and I confirmed that the full_name.name column aligned with the fruit_name column. We have recently fixed issues related to structs such as ARROW-10493 and my assumption is that you encountered one of those.
We are on the verge of releasing 3.0.0. There is an RC available at https://bintray.com/apache/arrow/python-rc/3.0.0-rc2#files/python-rc/3.0.0-rc2 if you would like to test this behavior yourself before the release.
Chen Ming: @westonpace Thank you for the information. I am very happy to see that 3.0.0 was released to PyPI this morning. In a quick test with the example data, the issue is fixed in PyArrow 3.0.0.
We want to do more testing with our production data, so I would like to keep this Jira issue open for a few more days.
Joris Van den Bossche / @jorisvandenbossche: I think it would be good to still extract a test case from your example to add to the test suite, if possible.
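A regression test along these lines could start from the sketch below. It is not the reporter's script (the attached code is not reproduced in this thread). The column names `full_name`, `name`, and `fruit_name` are taken from the comment above; the row layout and the helper names `find_misaligned_rows` and `roundtrip_rows` are assumptions made for illustration.

```python
def find_misaligned_rows(rows, struct_col="full_name", field="name",
                         flat_col="fruit_name"):
    """Return the indices of rows whose struct field differs from the flat column.

    `rows` is a list of dicts, one per record, e.g.
    {"fruit_name": "apple", "full_name": {"name": "apple"}}.
    """
    mismatches = []
    for i, row in enumerate(rows):
        # Treat a missing or null struct as an empty one.
        struct_value = row.get(struct_col) or {}
        if struct_value.get(field) != row.get(flat_col):
            mismatches.append(i)
    return mismatches


def roundtrip_rows(rows, path):
    """Write `rows` to a Parquet file with pyarrow and read them back as dicts."""
    # Imported lazily so the checker above has no third-party dependency.
    import pyarrow as pa
    import pyarrow.parquet as pq

    # Table.from_pylist / Table.to_pylist need a recent pyarrow (7.0 or later).
    table = pa.Table.from_pylist(rows)
    pq.write_table(table, path)
    return pq.read_table(path).to_pylist()
```

A test would then assert `find_misaligned_rows(roundtrip_rows(rows, path)) == []` for an input in which every `full_name.name` equals `fruit_name`. The input may need to be as large as the original example to trigger the problem on pyarrow 2.0.0; a handful of rows might not reproduce it.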
Hi,
We recently found an out-of-order issue with the 'struct' data type and would like to know if you can help find the root cause.
The above code (attached as test_struct_200.py) runs with the following Python packages:
Then I use parquet-tools (1.11.1) to read the file, but I get the following output:
(BTW, you can also view the parquet file with http://parquet-viewer-online.com/)
The output is supposed to be (refer to test_struct.csv):
As a comparison, the following code (attached as test_struct_200_flat.py) generates a parquet file with the same data as test_struct.csv:
I have also attached the two parquet files for your reference.
Reporter: Chen Ming
Original Issue Attachments:
Note: This issue was originally created as ARROW-11344. Please see the migration documentation for further details.