apache / arrow-rs

Official Rust implementation of Apache Arrow
https://arrow.apache.org/
Apache License 2.0

Read nested Parquet 2-level lists correctly #6757

Open etseidl opened 2 days ago

etseidl commented 2 days ago

Which issue does this PR close?

Closes #6756.

Rationale for this change

See issue.

What changes are included in this PR?

Modifies both the arrow and record readers to check for nested lists before applying the backward-compatibility rule that treats a repeated group named "array" as list<OneTuple>.
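For illustration only, the ordering of the checks can be sketched roughly as follows. This is not the code in this PR: `RepeatedGroup`, `ElementKind`, and `classify_repeated_group` are hypothetical names, the schema model is heavily simplified, and detecting a nested list via a LIST annotation on the repeated group is an assumption. The `<list name>_tuple` case comes from the Parquet format's backward-compatibility rules for lists.

```rust
/// Simplified, hypothetical stand-in for the repeated group found directly
/// inside a LIST-annotated group. Not a type from the parquet crate.
#[derive(Debug)]
struct RepeatedGroup<'a> {
    name: &'a str,
    /// Assumption: a nested list is recognised by a LIST annotation
    /// on the repeated group itself.
    annotated_as_list: bool,
    num_fields: usize,
}

/// How the repeated group should be interpreted by a reader.
#[derive(Debug, PartialEq)]
enum ElementKind {
    /// The repeated group is the element, and the element is itself a list.
    NestedList,
    /// The repeated group is the element, read as a struct (possibly a one-tuple).
    Struct,
    /// Standard 3-level list: the group's single child is the element.
    WrappedChild,
}

fn classify_repeated_group(group: &RepeatedGroup, list_name: &str) -> ElementKind {
    // A group with multiple fields is always the element type (a struct).
    if group.num_fields != 1 {
        return ElementKind::Struct;
    }
    // Check for a nested list BEFORE applying the name-based rule below.
    if group.annotated_as_list {
        return ElementKind::NestedList;
    }
    // Legacy rule: a single-field group named "array" or "<list name>_tuple"
    // is the element type itself, i.e. a one-tuple struct.
    if group.name == "array" || group.name == format!("{list_name}_tuple").as_str() {
        return ElementKind::Struct;
    }
    ElementKind::WrappedChild
}
```

With this ordering, a repeated group named "array" that is itself a list is classified as a nested list rather than as a one-tuple struct, while an unannotated single-field "array" group still falls under the legacy rule.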

Are there any user-facing changes?

Yes. Some legacy schemas containing nested 2-level lists are now interpreted differently.

etseidl commented 16 hours ago

Please fix CI 👍

I wish I could 😅. It appears that test is borked for everyone due to some upstream shenanigans.

And now fixed by #6745