Open sebffischer opened 3 months ago
So the poll basically ended 52:48 in favor of the feature.
Quoting myself here from the discussion of the thread:
I think ideally, indexing beyond bounds should already err, i.e. (1:10)[11] should not return NA but an error.
If one agrees with that, I think the problem with indexing with NA is that it is unclear whether the NA is a value within the range of indices of a vector. I.e. if I do (1:10)[NA], the NA could be a value > 10, which should result in an error.
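For reference, here is a small sketch of what base R currently returns in the cases mentioned above. The variable names are only illustrative; the behavior shown is standard base R indexing.

```r
x <- 1:10

# Indexing past the end returns NA instead of raising an error.
out_of_bounds <- x[11]

# An integer NA index yields a single NA element.
na_integer <- x[NA_integer_]

# A bare NA is a logical value, so it is recycled over the whole vector
# and the result has as many NAs as x has elements.
na_logical <- x[NA]

print(out_of_bounds)
print(na_integer)
print(length(na_logical))
```

Note that the last case is a logical index, so it behaves differently from an integer NA even though both are written as a missing value.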
Hi, just to add something to the discussion, I think R's behaviour has two advantages: first, it makes the length of the subset predictable (it will always be the same length as the subsetting vector, regardless of how many NAs there are), and second, it avoids dropping data when subsetting a vector with a function of itself, e.g.
chicken_weights <- c(1, 4, NA, 2, 3, 10)
heavy_chickens <- chicken_weights[chicken_weights >= 3]
If NAs weren't kept, heavy_chickens would have no missing data, which is misleading. I'm not on Mastodon so I'm not sure if this issue was raised before.
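To make the two points concrete, here is a short sketch in base R. The names keep, dropped, idx and by_position are introduced only for illustration.

```r
chicken_weights <- c(1, 4, NA, 2, 3, 10)

# The comparison itself propagates the missing value into the index.
keep <- chicken_weights >= 3

# Logical subsetting keeps an NA for the chicken whose weight is unknown.
heavy_chickens <- chicken_weights[keep]

# Wrapping the index in which() silently drops that chicken.
dropped <- chicken_weights[which(keep)]

# With an integer index, the result is as long as the index,
# no matter how many NAs the index contains.
idx <- c(2L, NA, 5L)
by_position <- chicken_weights[idx]

print(heavy_chickens)
print(dropped)
print(length(by_position) == length(idx))
```

The which() variant is shown only as a contrast: it is the kind of silent dropping that the current behavior avoids.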
Your example is an important use case to keep in mind and was not yet mentioned on Mastodon, thanks for raising it!
I agree that NAs should probably not be dropped, but I think it might make sense to throw an error if they are present in a subset.
This would mean that the user has to specify explicitly how to handle the NAs, which I think might be a benefit of this model and cause more careful handling of missing values. R-like behavior can still be achieved by including a check for missingness in the subset.
chicken_weights <- c(1, 4, NA, 2, 3, 10)
heavy_chickens <- chicken_weights[is.na(chicken_weights) | chicken_weights >= 3]
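As a rough illustration of the erroring model, here is a minimal sketch written in plain R. The helper strict_subset and its error messages are hypothetical and not part of any existing API; it only covers the NA and positional out-of-bounds cases discussed here.

```r
# Hypothetical helper: refuses NA indices and out-of-bounds positions
# instead of returning NA. Not an existing function.
strict_subset <- function(x, i) {
  if (anyNA(i)) {
    stop("index contains NA; handle missing values explicitly")
  }
  if (is.numeric(i) && any(abs(i) > length(x))) {
    stop("index out of bounds")
  }
  x[i]
}

chicken_weights <- c(1, 4, NA, 2, 3, 10)

# Explicitly opting in to keeping the missing weight works as before.
heavy_chickens <- strict_subset(
  chicken_weights,
  is.na(chicken_weights) | chicken_weights >= 3
)

# Leaving the NA in the index unhandled now fails loudly.
na_result <- tryCatch(
  strict_subset(chicken_weights, chicken_weights >= 3),
  error = function(e) conditionMessage(e)
)

# So does indexing past the end of the vector.
oob_result <- tryCatch(
  strict_subset(1:10, 11),
  error = function(e) conditionMessage(e)
)

print(heavy_chickens)
print(na_result)
print(oob_result)
```

The point of the sketch is only that the R-like result stays reachable with one extra is.na() term, while the implicit case becomes an error.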
I see, erroring out would make sense :+1:
Currently we have
which is in agreement with R, but I am not so sure whether this is something that should be possible. Especially for developers this does not seem so nice, but I am also not so sure whether this is something one wants when working with the REPL interactively.
I think for these types of questions it is also nice to just ask people, so I started a poll on Mastodon. The only problem is that I have no followers 😅, but let's see.