cis-ds / Discussion

ordering commands in filter #177

Closed by sizhenf 3 years ago

sizhenf commented 3 years ago

Hi everyone,

I have a question about whether the order of the commands in filter matters. When there are two commands in filter, does R run the first one first and the second one second, or does it run them at the same time?

An example:

case_vote %>%
  group_by(justiceName) %>%
  filter(declarationUncon = 2 | 3 | 4,
         n() >= 30) 

In this case, I'd like to filter declarationUncon = 2 | 3 | 4 first and then keep n() >= 30. Will the program run the commands in order as I wish?

Thanks, Serena

bensoltoff commented 3 years ago

It sounds like you want an AND operator. That is, only keep observations with a declarationUncon value of 2, 3, or 4, AND observations with justiceName that appears at least 30 times.

You could split it into two distinct filter() operations, which may make things a bit easier to understand. However, if you want a single function call, you need to ensure the OR operator applies to each of declarationUncon's possible values. Your syntax right now does not do that: a single = is read as a named argument rather than a comparison, and 2 | 3 | 4 does not test declarationUncon against each value. Review the reading for filter operations and logical operators. You have to repeat the declarationUncon test three separate times, one for each value:

filter(declarationUncon == 2 | declarationUncon == 3 | declarationUncon == 4)

or use the %in% operator to check that declarationUncon is one of those values:

filter(declarationUncon %in% c(2, 3, 4))
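To see why the chained version misbehaves, here is a small base R sketch. The vector below is made up for illustration and only stands in for the real declarationUncon column.

```r
# Hypothetical values standing in for the declarationUncon column
declarationUncon <- c(1, 2, 3, 4, 1)

# Non-zero numbers are coerced to TRUE, so this expression is just TRUE
2 | 3 | 4

# == binds tighter than |, so this reads as (declarationUncon == 2) | 3 | 4,
# which is TRUE for every element
always_true <- declarationUncon == 2 | 3 | 4

# Repeating the comparison tests each value separately
repeated <- declarationUncon == 2 | declarationUncon == 3 | declarationUncon == 4

# %in% performs the same membership test in one step
membership <- declarationUncon %in% c(2, 3, 4)

# Both correct forms agree with each other
identical(repeated, membership)
```

The always_true vector keeps every row, which is why the comparison has to be written out in full or replaced with %in%.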

Once you fix that, you can combine it with the second test

# parentheses around the OR chain are optional, but make the grouping explicit
filter((declarationUncon == 2 | declarationUncon == 3 | declarationUncon == 4), n() >= 30)

# the %in% version is already a single test
filter(declarationUncon %in% c(2, 3, 4), n() >= 30)
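
On the original question of ordering: conditions inside a single filter() are combined with AND and evaluated on the whole group, so n() counts rows before any are dropped. If the count should only include rows that passed the first test, two separate filter() calls behave differently. The sketch below uses made-up toy data and a threshold of 3 instead of 30, both assumptions chosen only to keep the example small.

```r
library(dplyr)

# Hypothetical toy data: justice A has 4 votes, justice B has 2
case_vote <- tibble(
  justiceName      = c("A", "A", "A", "A", "B", "B"),
  declarationUncon = c(1, 1, 2, 3, 2, 4)
)

# One filter(): n() is the size of the whole group, counted before
# any rows are dropped, so A's two matching rows are kept
single_call <- case_vote %>%
  group_by(justiceName) %>%
  filter(declarationUncon %in% c(2, 3, 4), n() >= 3)

# Two filter() calls: the second n() counts only the rows
# that survived the first filter, so no group reaches 3
two_calls <- case_vote %>%
  group_by(justiceName) %>%
  filter(declarationUncon %in% c(2, 3, 4)) %>%
  filter(n() >= 3)

nrow(single_call)
nrow(two_calls)
```

Which version is right depends on whether the 30-row threshold is meant to apply to all of a justice's votes or only to the votes with declarationUncon of 2, 3, or 4.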

sizhenf commented 3 years ago

Got it! Thank you very much!