moodymudskipper / inops

Infix Operators for Detection, Subsetting and Replacement
GNU General Public License v3.0
40 stars 0 forks source link

first try to simplify the code #8

Closed karoliskoncevicius closed 5 years ago

karoliskoncevicius commented 5 years ago

@moodymudskipper attempted to simplify the %in% functions and extract the common mechanism. Seems to be working for me, but please do check when you have time, if I didn't remove something you think should be left in.

For now I removed the %subset% functions - as the name is not yet decided. But these will be trivial to add back. Also removed the formula interface (for now).

Also removed the case where value was a function. I think it can fail in some cases. For example:

x <- 1:10
mean <- mean(x)
x %in[]% c(2,5) <- mean  # ambiguous if it should be a function or a variable...

I really liked your range() solution. Except it was missing na.rm=TRUE for cases where user gives something with NA in it.

Also %in{}% made it so that when x has NA - it will remain as NA, not FALSE after %in{}%

moodymudskipper commented 5 years ago

cool! I'll take a look.

About na.rm in range that's fine, I was hesitant about it and opted to keep it to default, thinking the user could use na.omit on it (after all if a value of the range is unknown, wether the lhs is in the range is unknown too), but I also believe that in most cases we want to omit the NAs so both choices are ok by me.

removing functions along with formulas : yes, when i agreed to remove formulas I meant to remove functions as well.

keeping NAs and not making them FALSE : OK too, but we'll have to make sure to document it as it differs from %in% 's behavior.

karoliskoncevicius commented 5 years ago

About na.rm in range: I am not too sure about which way to go. If the range would only accept 2 values - then I would def be for keeping NA. However when we allow range to be a whole numeric vector and call range() on it - I think users will start throwing whole vectors as ranges, and they might contain NAs. In those cases they should be removed... Maybe we should actually get back from calling "range" on the specified interval and let users call range() themselves? Not too sure.

Regarding consistency with %in% - I am aiming to make %in{}% be consistent with == operators (and at the same time with other %in[]% since they use comparisons within.

karoliskoncevicius commented 5 years ago

Surprise to me: any doesn't work on data.frame

any(data.frame(A=TRUE, B=FALSE))
## Error in FUN(X[[i]], ...) :
##   only defined on a data frame with all numeric variables

But works on numeric data.frame:

any(data.frame(A=0, B=1))
## [1] TRUE
##
## Warning message:
## In any(c(0, 1), na.rm = FALSE) :
##   coercing argument of type 'double' to logical
moodymudskipper commented 5 years ago

I believe that to be consistent with comparison operators, and return a matrix, %in{}% should be more like:

`%in{}%` <- function(x, table) {
  if (is.atomic(x) || all(lengths(x) == 1)) {
    res <- x %in% table
  } else if (is.data.frame(x) || is.matrix(x)){
    res <- apply(x, 2, `%in%`, table)
  } else {
    stop("x should be an atomic vector, a flat list, a data frame or a matrix")
  }
  res[is.na(x)] <- NA
  res
}

note that I added some tests because comparison operators fail on lists, unless they are flat lists (all length one elements).

apply converts to matrix first though, which might be problematic, so probably better consider the data frame casese parately and convert the result to matrix.

moodymudskipper commented 5 years ago

I like the idea of making %like% consistent with the others (regarding data.frame input), but the definition I had taken was borrowed from data.table, maybe we could name it %like{}% ? It doesn't (and didnt) support a vector in the rhs though, which would be good to a bit more like %in{}%.

It suffers, as do other functions, from the fact they don't output matrices to be consistent with comparison operators but I think we talked about that after you committed this.

So this all might be better discussed within issues.

I will accept your proposals because it goes all in the right direction in any case. Thanks for the good work. I apologise for not reacting faster it has been hard to find the time (and I'm ashamed to admit I don't master git for collaboration so I had to copy paste these functions to test them :) ).

I hope to get better at it!

karoliskoncevicius commented 5 years ago

Oh no worries, I am no git master myself :)

Typically what I do when changing something:

git checkout -b dev-feature . # this will create a new branch "dev-feature"

Then work, work work. Then

git commit -m "bla bla"
git push

After that have to go to github and issue a pull-request.

On your side I think you can just pull the branch "dev-feature" from the github and see all the changes there.