vincentarelbundock / modelsummary

Beautiful and customizable model summaries in R.
http://modelsummary.com

`datasummary_balance`: provide an error if LHS of formula is not empty? #751

Closed by etiennebacher 5 months ago

etiennebacher commented 5 months ago

I wanted to apply datasummary_balance() to a subset of variables only, e.g. datasummary_balance(mpg ~ am, mtcars), but on a much larger dataset where this took a while. I only realized afterwards that the formula argument must be one-sided, and that it computes the difference in means for all variables no matter what I specify in the LHS of the formula.

Is it possible to have an error/warning when the LHS of the formula is not empty? Alternatively, it would be nice to be able to specify only some variables in the LHS so that I don't need to add an additional select() before making the table.
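For illustration, the error requested here could look roughly like the sketch below. It is base R only; `check_one_sided()` is a hypothetical helper name and is not taken from the modelsummary codebase. It relies on the fact that a two-sided formula has length 3 (`~`, LHS, RHS), while a one-sided formula has length 2.

```r
# Hypothetical helper, NOT modelsummary's actual code:
# stop early when the formula has a left-hand side.
check_one_sided <- function(formula) {
  if (!inherits(formula, "formula")) {
    stop("`formula` must be a formula.")
  }
  # Two-sided formulas have three elements: `~`, LHS, RHS
  if (length(formula) == 3) {
    stop(
      "`formula` must be one-sided, e.g. `~ am`. Found left-hand side: ",
      deparse(formula[[2]])
    )
  }
  invisible(formula)
}

# One-sided formula: passes silently
check_one_sided(~ am)

# Two-sided formula: the error message is captured instead of aborting
res <- tryCatch(
  check_one_sided(mpg ~ am),
  error = function(e) conditionMessage(e)
)
```

Failing before any computation starts would avoid the situation described above, where the full table is built on a large dataset before the mistake is noticed.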

vincentarelbundock commented 5 months ago

I like your idea of allowing LHS variables. It should be trivial to implement column subsetting at the top of the datasummary_balance() function definition.
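A minimal sketch of what such column subsetting might look like, using base R only. The helper name `subset_by_lhs()` is an assumption made for illustration; it is not the code that was eventually merged into modelsummary.

```r
# Hypothetical helper, NOT modelsummary's actual implementation:
# if the formula is two-sided, keep only the LHS variables plus the
# grouping variable(s) on the RHS; otherwise return the data unchanged.
subset_by_lhs <- function(formula, data) {
  if (length(formula) == 3) {
    lhs <- all.vars(formula[[2]])  # variables to summarize
    rhs <- all.vars(formula[[3]])  # grouping variable(s); empty for `~ 1`
    keep <- unique(c(lhs, rhs))
    missing_vars <- setdiff(keep, colnames(data))
    if (length(missing_vars) > 0) {
      stop("Variables not found in data: ",
           paste(missing_vars, collapse = ", "))
    }
    data <- data[, keep, drop = FALSE]
  }
  data
}

# Two-sided: only mpg, hp, and the grouping variable am are kept
two_sided <- subset_by_lhs(mpg + hp ~ am, mtcars)
colnames(two_sided)

# One-sided: all columns are kept
one_sided <- subset_by_lhs(~ am, mtcars)
ncol(one_sided) == ncol(mtcars)
```

Running this at the top of the function would let the rest of the code keep working with a one-sided formula on the reduced data.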

etiennebacher commented 5 months ago

I think it would also make sense to put it in sanitize_datasummary_balance_data().

FYI, I'm not familiar with the modelsummary codebase and all the tables functions (All(), etc.), so I probably won't propose a PR (or at least not soon).

vincentarelbundock commented 5 months ago

You can now subset with the LHS of the formula. This is a cool little feature, I think.

https://github.com/vincentarelbundock/modelsummary/pull/755
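A usage sketch for the feature, assuming an installed version of modelsummary that includes the change from the pull request linked above. The exact formula syntax for several LHS variables (`mpg + hp ~ am`) is an assumption based on this thread, so check the package documentation for your version. The calls are guarded so the snippet still runs when modelsummary is not installed.

```r
# Workaround available before the change: subset the columns by hand
# and pass a one-sided formula.
sub <- mtcars[, c("mpg", "hp", "am")]

if (requireNamespace("modelsummary", quietly = TRUE)) {
  library(modelsummary)

  # Workaround: balance table on the manually subsetted data
  datasummary_balance(~ am, data = sub)

  # Assumed new usage: variables on the LHS select the columns to summarize.
  # Wrapped in try() because older versions may not support this.
  try(datasummary_balance(mpg + hp ~ am, data = mtcars))
}
```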