TimTeaFan / dplyover

Create columns by applying functions to vectors and/or columns in 'dplyr'.
https://timteafan.github.io/dplyover/

Consider splitting "helper functions" into a separate package #6

Open brshallo opened 3 years ago

brshallo commented 3 years ago

For someone new to {dplyover}, it might be nice if the documentation / reference page focused just on funs that are clearly dplyr::across() extensions (so newcomers don't get distracted by funs they probably shouldn't worry about when first starting). This seems especially true if you are planning to expand on these helper funs.


This might just be more of a hassle than anything to implement at this point though... It may also be more convenient to keep them together, and since these funs seem to be meant only for use within {dplyover}, it may not really make sense to separate them.

...though one could make an analogy with how {tidyselect} is separate from {dplyr} (not a great one, because {tidyselect} gets used in a bunch of tidyverse packages and serves a different purpose than the helper funs in {dplyover}). However, there does seem to be tidy precedent for modularizing such extensions into a distinct package (e.g. fabletools and fable, or various other "tidy" ecosystems that seem to keep packages very modular).

TimTeaFan commented 3 years ago

I understand both arguments. At the moment I would leave all helper functions as part of the package. Prominent (non-tidyverse) packages such as data.table have helper functions like rleid, which are often used outside of the data.table context. The question is, when would the {dplyover} helper functions warrant their own package? I'd say, if all (or at least 1. & 2. or 1. & 3.) of the following conditions are met:

  1. the helper functions are useful enough to be used outside the context of the over-across function family
  2. {dplyover} has a large user base and a substantial part (at least 20%) of the users load {dplyover} just for the helper functions
  3. the helper functions are used by more than one package.

I personally use some of the functions regularly, like dist_values. So I think at least some of the helper functions meet condition 1. But at the moment the package does not have enough users, and no other packages rely on its helper functions, so conditions 2 and 3 are not met.