Open rsangole opened 6 years ago
Ok, I'll do this in steps, it appears I can't write this proposal all in one go..
A large part of the R community, among which I suspect a large part of the ProjectTemplate users, are migrating to the tidyverse
packages by Hadley Wickham. A major advantage of these packages is their uniform interface to functions (like the convention to always take data as a first argument) in combination with the high performance.
A few of the major packages in the tidyverse
that can be useful in the internals of ProjectTemplate are:
dplyr
: famously well at manipulating data.frames with clear verbs as function names (mutate
, filter
, etc.)purrr
: replacements for the apply
family of functions for looping over vectorstibble
: enhanced print functions for data.frames, and a replacement function data_frame
which also handles list-type columns better.readr
: High performace import functions for several file typesreadxl
: High performance import function for Excel files without external dependenciesBecause inside several functions there is actually some data manipulation, and of course we import several types of files. There are some caveats though:
tidyverse
, some default variable names will become inconsistent between file types if we don't take some precautions.Thanks for getting this kicked off. As I think of things, I'll append too.
One thing that I'd like to consider is using assertive
as the pipe-able tool of choice to writing robust unit tests. If you haven't used it before, give it a go. It's a very expressive and tidyverse
friendly package.
@rsangole I haven't used assertive
yet, but I would say that changing the tests is something to be done with much care. We should NOT convert library functions to tidyverse
while at the same time changing the tests. If assertive
is a great improvement over testthat
then we should either:
tidyverse
and then prove using the old tests that nothing has changed we don't want to change, and only then change the tests (again by building redundant tests and then removing the old tests).I think that option 2 is the best if we don't want interface changes. If we do accept interface changes we might also first convert the tests and then change the library code.
Does assertive
integrate with the RStudio build tools and devtools::test()
by the way?
We would also need to consider if assertive
will work with travis
.
Agree with both @Hugovdberg and @KentonWhite . We should not try and 'fix' what currently works. I was of the opinion that:
tidyverse
, if the existing tests work, do not change the test codes at all.assertive
as a method to write any new tests we write for new features/bug removal etc, as assertive
is more programmer friendly (like other tidyverse
tools).assertive
works with requirements of devtools
or travis
, use assertive
as an alternate way of writing tests.Does assertive integrate with the RStudio build tools and
devtools::test()
by the way? We would also need to consider ifassertive
will work withtravis
.
I don't know answers to either of these questions. I can read up and check.
Any ETA on how far off this is from realization? I've had to rewrite a bunch of code because the field names being imported are now with a dot between them, which may not be a bad option, but would be nice to import without doing this to avoid a lot of code rewrite. I don't have a huge amount of extra time but if there's anything I can do to help, let me know.
A ways off. There are issues with backwards compatibility. Not every project is tidyverse compliate. Bioconductor, which represents a large user base from the biological sciences, are not tidyverse compliant. Moving everything over to tidyverse causes problems for those users.
As colleagues and I had issues with stability of code written using tidyverse packages (functions were not backward compatible and thus not working after one or two years with new package versions) we now avoid using tidyverse packages wherever possible. data.table delivers most functionalities we need for data analysis and is very fast as well. Therefore I would be really unhappy if the use of ProjectTemplate would involve more tidyverse software.
@Hugovdberg - making a space for you to document your thoughts on this. I recall you mentioned this in #220 .