moodymudskipper / inops

Infix Operators for Detection, Subsetting and Replacement
GNU General Public License v3.0

package scope #4

Closed moodymudskipper closed 4 years ago

moodymudskipper commented 4 years ago

@KKPMW said :

What I would like to see in this package:

in[] in{} in(), etc

out[] (or !in[]) variants

variants that work with tables (so replace values that occur some n of times)

variants that help with "cut", so maybe x %#cut% 5 <- LETTERS[1:5] would cut x into 5 intervals and name them A-E.

karoliskoncevicius commented 4 years ago

This to me seems like the most important issue to begin with.

My suggestion would be to start implementing functionality that absolutely will be in the package. Maybe get one function in near-perfect shape (maybe some %in% variant). Then implement others according to that template (other %in% variants and %in%). And after that see if we can find natural expansions to branch out without deviating from the form/syntax.

If you like the names of %in{}% and %in[]%, etc - we can start with them maybe.

This is open to suggestions/comments of course.

moodymudskipper commented 4 years ago

A first try to define the scope of this package :

It aims to provide infix operators that help detect, subset or replace elements of a vector or list.


output type

It means that ideally all functionalities should have 3 counterparts :

It seems like we've decided to use the in suffix for all detect functionalities.

For replacement not in place we can either use `%in[]<-%`(x, range, value) or replace(x, x %in[]% range, value)
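As a minimal sketch of the two spellings above, assuming a closed-interval meaning for `%in[]%` (both helper bodies below are hypothetical illustrations, not the package's code), the two forms should give the same result:

```r
# Hypothetical closed-interval detector: TRUE where range[1] <= x <= range[2].
`%in[]%` <- function(x, range) x >= range[1] & x <= range[2]

# Hypothetical functional form of the replacement, called with backticks.
`%in[]<-%` <- function(x, range, value) replace(x, x %in[]% range, value)

x <- c(1, 5, 10, 15)
a <- `%in[]<-%`(x, c(4, 11), 0)        # functional form
b <- replace(x, x %in[]% c(4, 11), 0)  # plain replace() around the detector

a                # 1 0 0 15
identical(a, b)  # TRUE
x                # unchanged: 1 5 10 15
```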

could be :

(but better discuss them in naming thread)


action type

We have several collections of operators, with their variants to satisfy the 3 types of output:


applied vs atomic

This description doesn't include currently named %in{}% because I don't see actually how it fits yet, or if we should have another set of functions for applied operation

karoliskoncevicius commented 4 years ago

I like the one-sentence scope you provided. Though I would add that they should work on a matrix (or an array) and preserve the dimensions, so that

data.matrix(iris[,-5]) %in[]% c(0,2)

would return a logical matrix, not a vector as %in% does. This is mainly a selfish need, because all the data I work with is always in a matrix format (not even data.frame...)
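To make the request concrete, here is a small sketch. The `%in[]%` body is an assumed closed-interval definition built from comparison operators, used only for illustration; the contrast with base `%in%` is standard R behaviour.

```r
# Hypothetical closed-interval detector; comparison operators keep dim().
`%in[]%` <- function(x, range) x >= range[1] & x <= range[2]

m <- data.matrix(iris[, -5])  # 150 x 4 numeric matrix

res <- m %in[]% c(0, 2)
dim(res)         # 150 4
is.logical(res)  # TRUE

# Base %in% goes through match(), which drops the dimensions.
flat <- m %in% c(0, 2)
dim(flat)        # NULL
length(flat)     # 600
```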

I agree with everything you wrote here, except I think I wasn't clear when explaining %in{}%. I tried to expand on that in the "names" issue.

moodymudskipper commented 4 years ago

Great, then let's make all these functions consistent with a matrix lhs. What about data frames?

could be :

karoliskoncevicius commented 4 years ago

I would vote for supporting data.frames if possible. Only drop this if it's hard to do.

karoliskoncevicius commented 4 years ago

Took a look at what I was doing with infixer, and I think data.frame should be supported. The main reason being that comparison operators (>, ==, etc.) support data.frame.

moodymudskipper commented 4 years ago

Good point, and iris == 3 returns a matrix. I'll take the comparison operators as a reference for consistency. I'm not sure what I'll do with the assignment versions for these cases though, but I'll play around with it and then we can discuss it further here.

I think I should be able to implement all the changes we discussed during the week

karoliskoncevicius commented 4 years ago

Yup, seems like iris == 3 returns a matrix, not a data.frame. Maybe to be consistent our operators also should return a matrix in that case?

moodymudskipper commented 4 years ago

yes I believe they should
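The base behaviour referred to here can be checked on a small data frame. The `%in[]%` definition is again a hypothetical sketch; because it is written with comparison operators, it inherits the same matrix output.

```r
df <- data.frame(a = c(1, 3), b = c(3, 4))

# Base comparison on a data frame returns a logical matrix.
res <- df == 3
is.matrix(res)      # TRUE
is.data.frame(res)  # FALSE

# Hypothetical closed-interval detector built from >= and <=.
`%in[]%` <- function(x, range) x >= range[1] & x <= range[2]

res2 <- df %in[]% c(3, 4)
is.matrix(res2)     # TRUE
dim(res2)           # 2 2
```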

moodymudskipper commented 4 years ago

We agreed now that the scope is to detect (logical output), subset, or replace matches according to equality, inequality, intersection with a range, or regex, using a decently generalisable syntax to welcome additional operators if necessary.

These operators return the same type of data (or warnings/errors) as equality and comparison operators do, when applied on flat atomic vectors, matrices, lists or data frames, with the difference that our right hand sides have different restrictions depending on the operator. They also treat NA as equality and comparison operators do, i.e. they keep them (unlike %in%).
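The NA difference mentioned above is plain base R behaviour and can be seen without any package code:

```r
x <- c(1, NA, 3)

eq   <- x == 1    # TRUE NA FALSE : NA is kept
inop <- x %in% 1  # TRUE FALSE FALSE : NA becomes FALSE

eq
inop
```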

Replacement operators are wrappers around our detection operators and replace, and are named as the assignment form of the detection operators (e.g. %in{}% and %in{}<-%). Assignment forms of the equality and comparison operators are defined as well (==<- etc.).
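A minimal sketch of this wrapper idea follows. Both function bodies are assumptions for illustration, with `%in{}%` read as set membership that keeps NA. Note one assumption about naming: for the in-place syntax `x %in{}% table <- value`, R's assignment machinery looks up a function called `%in{}%<-`, so that is the name used here rather than the `%in{}<-%` spelling in the text.

```r
# Hypothetical detector: set membership that keeps NA, as == would.
`%in{}%` <- function(x, table) {
  out <- x %in% table
  out[is.na(x)] <- NA
  out
}

# Hypothetical replacement form: replace() wrapped around the detector.
# which() drops NA positions so they are left untouched.
`%in{}%<-` <- function(x, table, value) {
  replace(x, which(x %in{}% table), value)
}

x <- c("a", "b", "c", NA)
x %in{}% c("a", "b") <- "z"
x  # "z" "z" "c" NA
```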

Additional ideas are to design additional infix operators to wrap our detection operators in :

But all those are considered out of scope for now as it's not clear if they bring enough value.

karoliskoncevicius commented 4 years ago

I tend to agree they do not bring enough value. From all those other variants, I only imagine the subset one brings value. For other instances simply wrapping the result in appropriate function like:

which(x %in{}% c("a", "b"))

seems to be enough.
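For instance, with an assumed NA-keeping definition of `%in{}%` (hypothetical, for illustration only), the ordinary base functions already cover these cases:

```r
# Hypothetical detector: set membership that keeps NA.
`%in{}%` <- function(x, table) {
  out <- x %in% table
  out[is.na(x)] <- NA
  out
}

x   <- c("a", "c", "b", NA)
hit <- x %in{}% c("a", "b")  # TRUE FALSE TRUE NA

idx <- which(hit)            # positions of matches: 1 3
n   <- sum(hit, na.rm = TRUE)  # number of matches: 2
ok  <- all(hit)              # FALSE, because "c" does not match
sub <- x[idx]                # subset of matches: "a" "b"
```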

Let's drop these which, all and sum variants?

moodymudskipper commented 4 years ago

pinning and closing as I think we're good here!