Open mapleFU opened 3 months ago
@alamb I'm willing to try this, but I'm not so familiar with parquet-rs. Do you think this would be better checked in ColumnChunkMetaData::byte_range or ColumnChunkMetaData::from_thrift? Or is this already checked, so we don't need it?
I am not sure -- I think the first thing we should do is get a reproducer. Let me see if I can whip up some tests
That looks good to me, FWIW
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
See: https://github.com/apache/parquet-testing/pull/58#issuecomment-2290985490
When reading a corrupt file, arrow-rs currently would have:
Should we check the range here?
Describe the solution you'd like
Check the range when building the group reader, or in "byte_range()".
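To make the proposal concrete, here is a rough sketch of what such a check could look like. It is not parquet-rs code: `checked_byte_range` and `RangeError` are hypothetical names, and it assumes the column chunk range starts at the dictionary page offset (falling back to the data page offset) and spans the compressed size, with the file length known to the caller.

```rust
use std::convert::TryFrom;

/// Hypothetical error type for this sketch; not part of parquet-rs.
#[derive(Debug, PartialEq)]
pub enum RangeError {
    NegativeOffset(i64),
    NegativeLength(i64),
    OutOfBounds { start: u64, len: u64, file_len: u64 },
}

/// Hypothetical helper: validate a column chunk's byte range against the
/// file length instead of trusting the (possibly corrupt) metadata.
/// Assumption: the range begins at the dictionary page offset if present,
/// otherwise at the data page offset, and is `compressed_size` bytes long.
pub fn checked_byte_range(
    dictionary_page_offset: Option<i64>,
    data_page_offset: i64,
    compressed_size: i64,
    file_len: u64,
) -> Result<(u64, u64), RangeError> {
    let raw_start = dictionary_page_offset.unwrap_or(data_page_offset);

    // Thrift stores these as signed integers, so a corrupt file can carry
    // negative values; reject them rather than casting.
    let start = u64::try_from(raw_start).map_err(|_| RangeError::NegativeOffset(raw_start))?;
    let len = u64::try_from(compressed_size)
        .map_err(|_| RangeError::NegativeLength(compressed_size))?;

    // `checked_add` guards against overflow before comparing with the file length.
    match start.checked_add(len) {
        Some(end) if end <= file_len => Ok((start, len)),
        _ => Err(RangeError::OutOfBounds { start, len, file_len }),
    }
}
```

With something like this, a chunk claiming offset 900 and length 200 in a 1000-byte file would yield an `OutOfBounds` error rather than a failed read later on. Whether the check belongs in from_thrift or in byte_range is still the open question; the sketch only shows the validation itself.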
Describe alternatives you've considered
Additional context