r-lib / vctrs

Generic programming with typed R vectors
https://vctrs.r-lib.org

feature request: `vec_proxy_na()` #1925

Closed khusmann closed 2 months ago

khusmann commented 7 months ago

Hello, thanks for this awesome library!

I'm working on a package for working with "missing reason" data in R.

I'm experimenting with vctrs to see if I can build a generic Result<Value, MissingReason> type vector. The goal is for it to act transparently like the value type while storing the reasons for missing values as an attribute. In my early experimentation, I've come really close.

It's 99% of the way there, but it just has one little quirk: I want to be able to define what I consider to be a "missing value" for the vector.

Right now, as you know, vec_detect_missing() and friends all use vec_proxy_equal() to determine what is considered missing. This creates a problem for me: I want equality to compare missing reasons (e.g. na("Reason 1") == na("Reason 1") && na("Reason 1") != na("Reason 2")), so the reason has to be part of the equality proxy. With the current behavior, that means vec_detect_missing() only treats an element as NA when both its value AND its reason are missing.
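To make the quirk concrete, here is a minimal sketch. It assumes a record-style vector built with new_rcrd() rather than the attribute-based design described above, and the class name demo_result and the field names value and reason are hypothetical, chosen only for illustration. Because the equality proxy of a record is a data frame of its fields, an element only counts as missing when every field is missing.

```r
library(vctrs)

# Hypothetical record type: `value` holds the data, `reason` explains why a
# value is missing. Both fields take part in the equality proxy.
x <- new_rcrd(
  list(
    value  = c(1, NA, NA),
    reason = c(NA, "refused", NA)
  ),
  class = "demo_result"
)

# Element 2 has a missing value but a non-missing reason, so it is not
# reported as missing. Only element 3 (both fields NA) is.
vec_detect_missing(x)
#> [1] FALSE FALSE  TRUE
```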

I can sort of work around this by providing a custom is.na() method, but that is only a surface-level fix: I want something that propagates properly into the tidyverse (e.g. tidyr::replace_na()).
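A rough sketch of that workaround and its limit, again using a hypothetical record type (demo_result, with made-up fields value and reason) as a stand-in for the real class: the S3 method changes what is.na() reports, but vec_detect_missing() does not go through is.na(), so it still answers from the equality proxy.

```r
library(vctrs)

# Hypothetical record type, used only for illustration.
x <- new_rcrd(
  list(
    value  = c(1, NA, NA),
    reason = c(NA, "refused", NA)
  ),
  class = "demo_result"
)

# Surface-level fix: define missingness from the value field alone.
is.na.demo_result <- function(x) {
  is.na(field(x, "value"))
}

is.na(x)
#> [1] FALSE  TRUE  TRUE

# vctrs itself is unaffected by the is.na() method.
vec_detect_missing(x)
#> [1] FALSE FALSE  TRUE
```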

I want to propose a new proxy for this: vec_proxy_na(). By default, it'd just call vec_proxy_equal() (so it would be 100% backwards compatible), but when overridden it would give developers like me a hook into the missingness behavior of their vctrs.
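The proposal could look roughly like the sketch below. Note that vec_proxy_na() is not part of vctrs; it is defined here as an ordinary S3 generic purely to illustrate the idea, and detect_missing_via_proxy(), demo_result, and the field names are likewise hypothetical. Nothing in vctrs or the tidyverse calls these functions.

```r
library(vctrs)

# Hypothetical generic: NOT a vctrs function, defined locally for illustration.
vec_proxy_na <- function(x, ...) {
  UseMethod("vec_proxy_na")
}

# Default: fall back to the equality proxy, so existing classes behave
# exactly as they do today.
vec_proxy_na.default <- function(x, ...) {
  vec_proxy_equal(x)
}

# Override for a hypothetical record type: only the value field decides
# whether an element is missing.
vec_proxy_na.demo_result <- function(x, ...) {
  field(x, "value")
}

# Hypothetical stand-in for a vec_detect_missing() that consults the new proxy.
detect_missing_via_proxy <- function(x) {
  vec_detect_missing(vec_proxy_na(x))
}

x <- new_rcrd(
  list(
    value  = c(1, NA, NA),
    reason = c(NA, "refused", NA)
  ),
  class = "demo_result"
)

detect_missing_via_proxy(x)
#> [1] FALSE  TRUE  TRUE
```

Equality would keep using vec_proxy_equal(), so two elements with different reasons would still compare as different while both counting as missing.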

That said, if there's an alternate way to hook into vec_detect_missing(), please let me know!

khusmann commented 2 months ago

No need to implement, see discussion in PR: https://github.com/r-lib/vctrs/pull/1928#issuecomment-2292034834