moodymudskipper / safejoin

Wrappers around dplyr functions to join safely using various checks
GNU General Public License v3.0

best join ? #28

Closed moodymudskipper closed 5 years ago

moodymudskipper commented 5 years ago

A fuzzy join variant where, for each row, we keep only the match with the minimal value given by the by formula.

In theory we could automatically switch to "best join" when the result of the formula is numeric rather than logical, but all of this is handled by the fuzzy_join function, so we can't intervene at this step.

We can do the cartesian product ourselves, but that's pretty much reimplementing fuzzy_join, which might not be a bad idea.
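A minimal sketch of what doing the cartesian product ourselves could look like, written in plain Python purely for illustration. It is not safejoin's or fuzzyjoin's code; `best_join`, `score`, and the sample rows are hypothetical names and data. It assumes the by formula returns a number and that "best" means the minimum per left row, with ties kept.

```python
from itertools import product

def best_join(left, right, score):
    """Hypothetical helper: for each left row, keep the right row(s)
    with the minimal score. Left rows with no candidate are dropped."""
    best = {}  # left index -> (minimal score so far, right rows reaching it)
    # Cartesian product of left rows (with their index) and right rows.
    for (i, l), r in product(enumerate(left), right):
        s = score(l, r)
        if i not in best or s < best[i][0]:
            best[i] = (s, [r])          # new strict minimum
        elif s == best[i][0]:
            best[i][1].append(r)        # tie: keep every best match
    # Merge each left row with its best right row(s), in left order.
    return [{**left[i], **r} for i, (_, rows) in sorted(best.items()) for r in rows]

# Made-up data: match each x to the closest y.
left = [{"id": "a", "x": 1.0}, {"id": "b", "x": 5.0}]
right = [
    {"y": 0.0, "label": "low"},
    {"y": 4.0, "label": "mid"},
    {"y": 9.0, "label": "high"},
]
joined = best_join(left, right, lambda l, r: abs(l["x"] - r["y"]))
```

Here row a is paired with "low" and row b with "mid". The product is materialized pair by pair, so the cost grows with the product of the two table sizes, which is the price of not relying on an existing fuzzy join.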

After this is done we still have the ambiguity of what "best" means: best for each group of the left table, or of the right table?

As it is a form of aggregation, it might fit better in eat, and then it is (I think) intuitive that the table on the left is the one we're grouping on to get these best values.
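The left/right ambiguity can be made concrete with a small sketch, again in plain Python with hypothetical names (`best_matches`, `dist`) and made-up values. It only shows that picking the best match per left row and per right row can return different pairs.

```python
def best_matches(group_rows, other_rows, score):
    """Hypothetical helper: for each row in group_rows, return the index
    of the lowest-scoring row in other_rows (first one wins on ties)."""
    return [
        min(range(len(other_rows)), key=lambda j: score(g, other_rows[j]))
        for g in group_rows
    ]

left = [1.0, 2.0, 10.0]
right = [0.0, 9.0]
dist = lambda a, b: abs(a - b)

# Grouping on the left: one (left index, right index) pair per left row.
per_left = [(i, j) for i, j in enumerate(best_matches(left, right, dist))]

# Grouping on the right: one (left index, right index) pair per right row.
per_right = [(i, j) for j, i in enumerate(best_matches(right, left, dist))]
```

Grouping on the left keeps all three left rows, while grouping on the right keeps only the two left rows that are closest to some right row, so left row 1 (value 2.0) disappears.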

moodymudskipper commented 5 years ago

will be part of https://github.com/moodymudskipper/safejoin/issues/33