moodymudskipper / inops

Infix Operators for Detection, Subsetting and Replacement
GNU General Public License v3.0
40 stars 0 forks source link

consistency between in_check ops and equality/comparison ops ? #13

Closed moodymudskipper closed 4 years ago

moodymudskipper commented 4 years ago

do we agree that they should be consistent ?

if so we still have some work to do on them :

> test_chr <- c(names(iris), NA)
> test_num <- c(1:5,NA)
> test_list1 <- c(as.list(c(1:5,NA)),
+                 as.list(c(letters[1:5], NA)), 
+                 as.list(factor(c(letters[1:5], NA)))) 
> test_list2 <- list(1:3, c(4:5,NA) , 
+                    letters[1:3], c(letters[4:5], NA), 
+                    factor(c(letters[1:5], NA)))
> test_df <- data.frame(
+   a = 1:3, b = c(4:5,NA), c = letters[1:3], d = c(letters[4:5], NA), 
+   e = factor(letters[1:3]), f = factor(c(letters[4:5], NA)),stringsAsFactors = FALSE)
> 
> test_chr == 3
[1] FALSE FALSE FALSE FALSE FALSE    NA
> test_list1 == 3
 [1] FALSE FALSE  TRUE FALSE FALSE    NA    NA    NA    NA    NA    NA    NA FALSE FALSE  TRUE FALSE FALSE    NA
Warning messages:
1: NAs introduced by coercion 
2: NAs introduced by coercion 
3: NAs introduced by coercion 
4: NAs introduced by coercion 
5: NAs introduced by coercion 
> test_list2 == 3
Error: (list) object cannot be coerced to type 'double'
> test_df == 3
         a     b     c     d     e     f
[1,] FALSE FALSE FALSE FALSE FALSE FALSE
[2,] FALSE FALSE FALSE FALSE FALSE FALSE
[3,]  TRUE    NA FALSE    NA FALSE    NA
> 
> 
> test_chr %in{}% 3
[1] FALSE FALSE FALSE FALSE FALSE    NA
> test_list1 %in{}% 3
 [1] FALSE FALSE  TRUE FALSE FALSE    NA FALSE FALSE FALSE FALSE FALSE    NA FALSE FALSE FALSE FALSE FALSE    NA
> test_list2 %in{}% 3
[[1]]
[1] FALSE FALSE  TRUE

[[2]]
[1] FALSE FALSE FALSE

[[3]]
[1] FALSE FALSE FALSE

[[4]]
[1] FALSE FALSE FALSE

[[5]]
[1] FALSE FALSE FALSE FALSE FALSE FALSE

> test_df %in{}% 3
         a     b     c     d     e     f
[1,] FALSE FALSE FALSE FALSE FALSE FALSE
[2,] FALSE FALSE FALSE FALSE FALSE FALSE
[3,]  TRUE    NA FALSE    NA FALSE    NA
> 
> 
> test_chr %in% 3
[1] FALSE FALSE FALSE FALSE FALSE FALSE
> test_list1 %in% 3
 [1] FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
> test_list2 %in% 3
[1] FALSE FALSE FALSE FALSE FALSE
> test_df %in% 3
[1] FALSE FALSE FALSE FALSE FALSE FALSE
moodymudskipper commented 4 years ago

Also what would be the expected output for list(list(1,2)) %in{}% list(list(1,2), 3) ?

I believe we should unlist by default the rhs, it seems to be what range() does anyway :

range(list(1,list(3,4)))
#> [1] 1 4

Then I think failing like == does is good enough so we avoid opening a new can of worms and can aim to have everything consistent with == rather than %in%, does that sound right ?

list(list(1,2)) %in% list(list(1,2), 3)
#> [1] TRUE
list(list(1,2)) == list(list(1,2), 3)
#> Error in list(list(1, 2)) == list(list(1, 2), 3): comparaison de ces types non implémentée
list(1,2) %in% list(1, 3)
#> [1]  TRUE FALSE
list(1,2) == list(1, 3)
#> Error in list(1, 2) == list(1, 3): comparaison de ces types non implémentée
c(1,2) == list(1, 3)
#> [1]  TRUE FALSE
c(1,2) %in% list(1, 3)
#> [1]  TRUE FALSE
list(1,2) == c(1, 3)
#> [1]  TRUE FALSE
list(1,2) %in% c(1, 3)
#> [1]  TRUE FALSE

Created on 2019-08-18 by the reprex package (v0.3.0)

moodymudskipper commented 4 years ago

I've pushed a few changes, including making %in{}% consistent, other check ops were ok, and I made tests for those

karoliskoncevicius commented 4 years ago

Making it consistent with == sounds about right to me!.

Also I am wondering if there is a way to make them adhere to "group generics". Specifically "Group Ops" in ?groupGeneric

moodymudskipper commented 4 years ago

Group "Ops": ... (arithmetic ops) "==", "!=", "<", "<=", ">=", ">" ... (on unary ops) The classes of both arguments are considered in dispatching any member of this group. For each argument its vector of classes is examined to see if there is a matching specific (preferred) or Ops method. If a method is found for just one argument or the same method is found for both, it is used. If different methods are found, there is a warning about ‘incompatible methods’: in that case or if no method is found for either argument the internal method is used.

I think group generics allow identical(iris < 3, 3 > iris) to be TRUE, I think it would be confusing in our case, you would want "^foo" %in~% x to work as x %in~% "^foo" ? Or maybe I'm missing something.

Note that the data.frame methods for the comparison ("Compare": ==, <, ...) and logic ("Logic": & | and !) operators return a logical matrix instead of a data frame, for convenience and back compatibility.

This is implemented in our operators already

If the members of this group are called as functions, any argument names are removed to ensure that positional matching is always used.

we could do that, but I don't see the point, the positional matching is used by primitives for speed as far as I understand, and operators are almost never invoked as functions.

karoliskoncevicius commented 4 years ago

What I meant is that if somebody writes a custom S3 (or S4 for that matter) object and implements Ops.class <- function(e1, e2) { ... } that it would extend automatically to the operators in this package as well.

Maybe it doesn't make much sense, since the e2 in our case is not an object of a new class.

Yeah let's drop this. Bad idea.

moodymudskipper commented 4 years ago

regex ops must be made consistent as well + tests and then we can close this

moodymudskipper commented 4 years ago

regex ops defined and should behave ok, tests to design still

moodymudskipper commented 4 years ago

closing this as we have an issue about unit tests https://github.com/moodymudskipper/rangeops/issues/7