moodymudskipper / safejoin

Wrappers around dplyr functions to join safely using various checks
GNU General Public License v3.0
42 stars 7 forks source link

Explicit check parameter ? #43

Closed moodymudskipper closed 2 years ago

moodymudskipper commented 3 years ago

I've been uncomfortable with the check parameter from the start, it was tough to decide between a long list of parameters or something compact. I chose the former to keep a simple interface and to be able to easily store a configuration, but I was naive thinking that after some use we'd remember the meaning of these letters. But maybe we can have the best of both worlds ?

What about an explicit function sj_check(). It would allow to build a config explicitly, so no need to go back to the doc to see what we meant by check = "~blC".

Instead of having the result of sj_check() output these cryptic strings, it would output a vector with a class and a printing method. This vector would contain items corresponding to all the b c u v ... options but named explicitly, and their values would be either NA (default), "inform", "warn" or "abort" (that we could abbreviate thanks to match.arg()).

example :

safe_inner_join(..., check = sj_check(implicit_by = "abort", inconsistent_factor_levels = "inform"))
# or 
check <- sj_check(implicit_by = "abort", inconsistent_factor_levels = "inform")
safe_inner_join(..., check =check)

The old way of doing it would still work and would be mentioned as the short form, but the main form would be the above and would be advertised first.

It would have its own help page too, uncluttering the main one.

Ljupch0 commented 3 years ago

Looks good and much more readable when you come back to it.

moodymudskipper commented 2 years ago

@Ljupch0 It's implemented in {powerjoin} https://github.com/moodymudskipper/powerjoin. As a user of {safejoin} your feedback and suggestions on the new package would be extremely helpful so don't hesitate to drop by the issues!