jorgecarleitao / arrow2

Transmute-free Rust library to work with the Arrow format
Apache License 2.0

Parse PrimitiveLogicalType::Unknown as Arrow Null DataType #1563

Open jaychia opened 1 year ago

jaychia commented 1 year ago

This PR adds logic to parse any Parquet field whose LogicalType is Unknown (`PrimitiveLogicalType::Unknown`) as the Arrow Null DataType.
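As a rough illustration of the mapping described above, here is a minimal sketch. The enums below are hypothetical, simplified stand-ins and not the actual parquet2/arrow2 definitions, and `to_data_type` is an illustrative name rather than the function changed in this PR. The only behavior taken from the description is that an Unknown logical type maps to the Null data type; the other arms are assumptions added to make the example complete.

```rust
// Hypothetical, simplified stand-ins for the Parquet-side types.
// These are NOT the real parquet2 definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PhysicalType {
    Int32,
    ByteArray,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PrimitiveLogicalType {
    String,
    Unknown,
}

// Hypothetical, simplified stand-in for the Arrow-side data type.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Null,
    Int32,
    Utf8,
    Binary,
}

/// Illustrative conversion: an `Unknown` logical type wins over the
/// physical type and yields `DataType::Null`. The remaining arms are
/// assumed fallbacks for the sake of the example.
pub fn to_data_type(
    physical: PhysicalType,
    logical: Option<PrimitiveLogicalType>,
) -> DataType {
    match (physical, logical) {
        // The case this PR is about: Unknown logical type -> Null.
        (_, Some(PrimitiveLogicalType::Unknown)) => DataType::Null,
        // Assumed fallbacks, keyed on the physical type.
        (PhysicalType::Int32, _) => DataType::Int32,
        (PhysicalType::ByteArray, Some(PrimitiveLogicalType::String)) => DataType::Utf8,
        (PhysicalType::ByteArray, _) => DataType::Binary,
    }
}
```

In this sketch, `to_data_type(PhysicalType::Int32, Some(PrimitiveLogicalType::Unknown))` returns `DataType::Null` regardless of the physical type, while a field with no logical type falls back to a type derived from its physical type.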

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 93.75% and project coverage change: -0.01% :warning:

Comparison is base (fb7b5fe) 83.06% compared to head (417c35c) 83.05%. Report is 1 commit behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #1563      +/-   ##
==========================================
- Coverage   83.06%   83.05%   -0.01%
==========================================
  Files         391      391
  Lines       42889    42895       +6
==========================================
+ Hits        35626    35628       +2
- Misses       7263     7267       +4
```

| [Files Changed](https://app.codecov.io/gh/jorgecarleitao/arrow2/pull/1563?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Jorge+Leitao) | Coverage Δ | |
|---|---|---|
| [src/io/parquet/read/schema/convert.rs](https://app.codecov.io/gh/jorgecarleitao/arrow2/pull/1563?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Jorge+Leitao#diff-c3JjL2lvL3BhcnF1ZXQvcmVhZC9zY2hlbWEvY29udmVydC5ycw==) | `94.60% <80.00%> (-0.09%)` | :arrow_down: |
| [src/compute/cast/mod.rs](https://app.codecov.io/gh/jorgecarleitao/arrow2/pull/1563?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Jorge+Leitao#diff-c3JjL2NvbXB1dGUvY2FzdC9tb2QucnM=) | `90.84% <100.00%> (+0.01%)` | :arrow_up: |

... and [4 files with indirect coverage changes](https://app.codecov.io/gh/jorgecarleitao/arrow2/pull/1563/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Jorge+Leitao)


jaychia commented 1 year ago

Also, there are no unit tests for this yet, but I would love pointers on how best to test this change.

Should I add a new column to the files generated by parquet_integration/write_parquet.py?