jorgecarleitao / arrow2

Transmute-free Rust library to work with the Arrow format
Apache License 2.0
1.07k stars 221 forks source link

Add `array::downcast_ref` and `array::downcast_mut` #1566

Open aldanor opened 10 months ago

aldanor commented 10 months ago

This introduces array::downcast_ref() and array::downcast_mut() helpers, providing a nicer downcasting interface which makes use of errors, so you could e.g. (in a result-context)

downcast_ref(&my_arr)?.do_stuff()

Downstream, if you're using arrow2, chances are you're converting arrow2::error::Error into your errors already anyway somewhere along the line, and working with Result is nicer then Option most of the time, because otherwise you'll have to .ok_or_else() on every downcast call.

codecov[bot] commented 10 months ago

Codecov Report

Patch coverage: 85.00% and project coverage change: -0.01% :warning:

Comparison is base (fb7b5fe) 83.06% compared to head (2a6ddfc) 83.05%. Report is 1 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1566 +/- ## ========================================== - Coverage 83.06% 83.05% -0.01% ========================================== Files 391 391 Lines 42889 42930 +41 ========================================== + Hits 35626 35657 +31 - Misses 7263 7273 +10 ``` | [Files Changed](https://app.codecov.io/gh/jorgecarleitao/arrow2/pull/1566?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Jorge+Leitao) | Coverage Δ | | |---|---|---| | [src/array/mod.rs](https://app.codecov.io/gh/jorgecarleitao/arrow2/pull/1566?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Jorge+Leitao#diff-c3JjL2FycmF5L21vZC5ycw==) | `80.36% <85.00%> (+0.64%)` | :arrow_up: | ... and [7 files with indirect coverage changes](https://app.codecov.io/gh/jorgecarleitao/arrow2/pull/1566/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Jorge+Leitao)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

aldanor commented 9 months ago

I am not sure about this. You can say something like this for any Option returning method. To me this feels something that needs to be resolved by the consumer of the arrow2 crate.

It's your call, but in my recent experiences as a daily arrow2 user, when you're dealing when reading lots of box-arrays from various sources (e.g. dealing with struct-like data and the like), it's downcasts all over the place. If you want to avoid unwraps, you will have to .ok_or_else() every single line to provide context on what just happened. Or just litter the code with unwraps because it's too much hassle otherwise.

The core difference between "just let the user figure it out, it's an option" here is that the array itself (both Array and MutableArray in fact) actually have access to the context, both "what you were trying to cast to" and "what the actual datatype is", because of .data_type(). In case of Any::downcast_*(), you don't have access to any additional context, so the only thing you can return is an option indeed. So, we can assemble a nice error message just once here; chances are, if the user wanted to assemble an error message, they would do almost precisely the same (i.e., use the actual data type of the underlying array). This way, it's just ? in any Result-like contexts as long as your error type is convertible from arrow2's (which it probably already is). There's not many other Option-yielding core APIs aside from downcasts in the library, this is the only bit really sticking out.

A minor side effect here is that it unifies downcasting of both array/mutable-array, but that's not the most important part.