atorus-research / Tplyr

https://atorus-research.github.io/Tplyr/
Other
95 stars 16 forks source link

Inconsistent behavior between set_denoms_by() and set_denoms_where() outside of layer context #147

Closed mstackhouse closed 8 months ago

mstackhouse commented 10 months ago

Prerequisites

For more information, see the CONTRIBUTING guide.

Description

set_denom_where() will apply on a tplyr_table object without issue, but set_denoms_by() will error because there's no method for a tplyr_table() object.

set_denom_where() is also NOT an S3 method, but set_denoms_by() is.

Side note - I don't love that one method has denoms and the other is denom...

Steps to Reproduce (Bug Report Only)

load(test_path('adsl.Rdata'))

tplyr_table(adsl, TRT01P) %>% 
  set_denoms_by(TRT01P) %>% 
  add_layer(
    group_count(RACE)
  ) 

tplyr_table(adsl, TRT01P) %>% 
  set_denom_where(SEX == "F") %>% 
  add_layer(
    group_count(RACE)
  ) 

Expected behavior: [What you expected to happen]

Consistent error or success of the method application

Actual behavior: [What actually happened]

set_denoms_by() errors, but set_denom_where() succeeds.

Versions

Current devel branch

ShiyuC commented 8 months ago

@mstackhouse The first chunk of code would work if set_denoms_by() is used within add_layer(), i.e.

tplyr_table(adsl, TRT01P) %>% 
  add_layer(
    group_count(RACE) %>%
      set_denoms_by(TRT01P)  
  )

Correct me if I'm wrong, but based on my understanding, the "by_" variable needs to be passed from "group_count(by = xxx)" to set_denoms_by, so they need to be performed within add_layer(). If that's the case, then the design for set_denoms_by and set_denom_where seem to be different. Could you provide more guidance on how to make them consistent? Do we want to use S3 method for set_denom_where() as well and make it only work within layer context?

mstackhouse commented 8 months ago

@ShiyuC now I remember that I was vacillating one this but you make a perfect point about the by_ variable within the layer. And I also forgot that there was a separate method for shift layers.

So for the issue itself, let's keep the behavior the same - but let's add type checking into set_denom_where() to make sure the object is a tplyr_table or a tplyr_layer. Also, since I'm looking at this - can we find anywhere that's using vars() inside the package and replace it with quos()? dplyr::vars is superseded and the function itself is literally:

function (...) 
{
    quos(...)
}
<bytecode: 0x560da4220320>
<environment: namespace:dplyr>

So we should just use quos() internally instead.

ShiyuC commented 8 months ago

@mstackhouse the type check is added within set_denom_where(). Please review https://github.com/atorus-research/Tplyr/pull/162 when you get a chance.

For the vars() replacement, it caused most of the tests in the package to fail. e.g. image

As this doesn't seem to be a simple "find and replace" issue, I'm going to create a separate PR/branch for it and investigate the errors