[ENH] Reconfiguring exports

paulhibbing commented 2 years ago

In my workflows, I often want to do multiple things with raw data, some that might involve conversion to counts, and others that don't. I plan to tweak the code in a way that allows this, but thought I'd mention it here as well for discussion. If it sounds favorable, I'd be happy to submit another PR.

My plan is to break up the wrapper function in a way that allows separation of reading, converting, and saving. This could be done with no change to the current usage:

get_counts <- function(args) { read_file() %>% calculate_counts() %>% return_output() }

But a little more perspicuous naming could also be warranted:

count_wrapper <- function(args) { read_file() %>% get_counts() %>% return_output() }

I don't love the function name count_wrapper, so I'm not sure exactly what the best names would be. And maybe the first option is better anyway, since it won't interfere with the operation of prior code. Just something I'm thinking about and throwing out for discussion.

I also noticed that the dot functions (e.g., .bpf_filter) are exported, which I would probably change in my PR as well. This doesn't cause any issues, but it would clean things up a bit. They could be re-exported in the future if it becomes clear they have multipurpose applications. But in that case, it would probably be best to remove the dot prefix, since that usually implies hiddenness.

bhelsel commented 2 years ago

I think that is an interesting idea. I'm not sure I can picture exactly what you are thinking in regards to breaking it up... So you'd essentially be adding exported read_file, calculate_counts, and return_output functions? And this would allow you, for example, do something to the raw file before calculating counts or do something to the counts file before returning or saving it?

I don't love the "count_wrapper" name, so I'd think the first option would be better unless we can come up with a different name. The name get_counts() is similar to the Python wrapper in the package released with the preprint article, so I see an advantage of keeping that as the wrapper name. They use "extract" where you use "calculate" and I don't really have a preference.

Exporting the .functions (e.g., .bpf_filter) was done primarily because sometimes I like to see the function with View(agcounts::.bpf_filter), but didn't want them to be included in the documentation. The "." was originally for hiding them, but I found that including @noRd in the documentation kept them out of the man files when running the roxygen::roxygenize() function. What do you think about just removing the "." for those functions and keeping them as exported options so those using the package can still view the function with View()?

paulhibbing commented 2 years ago

You can still access un-exported functions using three colons, e.g. View(agcounts:::.bpf_filter), so I'd recommend just keeping them internal.

Documentation is also optional for things that aren't exported, so if you're not wanting to include the roxygen stuff in the package anyway, you could consider deleting all the tags etc for hidden functions. That said, it can be useful to other GitHub collaborators (not to mention your future self) to have the descriptions of what's going on, and you've already put the effort into writing it, so maybe it'd be better to keep it around and just add @keywords internal.

bhelsel commented 2 years ago

I didn't know about using the three colons to view the un-exported functions. That's great to know! I'm learning so much from you, Paul. I think that sounds like a good plan. Let's move the those functions to internal, but keep the documentation since we already have it.

bhelsel / agcounts

[ENH] Reconfiguring exports #5