easystats / datawizard

Magic potions to clean and transform your data 🧙
https://easystats.github.io/datawizard/
Other
212 stars 16 forks source link

degroup() for cross-classified data #521

Closed strengejacke closed 3 months ago

strengejacke commented 3 months ago

Fixes #520

strengejacke commented 3 months ago

Tagging @jmgirard, you may review this PR.

strengejacke commented 3 months ago

@etiennebacher Currently, when a variable name was misspelled, we simply ignore it and print a message (not even a warning!). Should we error here?

library(datawizard)
data(efc, package = "datawizard")
degroup(
  efc,
  select = c("c12hour", "neg_c_8"),
  by = c("e42dep", "c173code"),
  suffix_demean = "_within"
) |> head()
#> Variables "neg_c_8" and "c173code" were not found in the dataset.
#>   Did you mean one of "neg_c_7" or "c172code"?
#>   c12hour_between c12hour_within
#> 1         52.7500      -36.75000
#> 2         52.7500       95.25000
#> 3         52.7500       17.25000
#> 4              NA             NA
#> 5        106.9683       61.03175
#> 6        106.9683      -90.96825

Created on 2024-06-27 with reprex v2.1.0

strengejacke commented 3 months ago

Ok, the implementation of nested designs seems to be wrong (see https://github.com/easystats/datawizard/issues/520#issuecomment-2195085052). I would remove that code and wait until we know how to do this correctly.

strengejacke commented 3 months ago

I think this is ready to be merged?

etiennebacher commented 3 months ago

There is still some discussion going on in #520, does this PR fix everything? If not, you might want to remove the "fixes #520" in the first post.

I just want to do a second pass this afternoon or tomorrow since there were many commits since the last review.

strengejacke commented 3 months ago

It works for the cross-classified case, as requested in #520. The extension to nested data structures came to my mind, but seems to be less trivial. I think this can be done later. The docs clearly state that degroup() works for the cross-classified, but not nested case right now.

strengejacke commented 3 months ago

I just want to do a second pass this afternoon or tomorrow since there were many commits since the last review.

That was an attempt to see if nested structures would work, but I then reverted the changed. So it's actually less new code you have to review ;-)