phuse-org / valtools

Validation framework for R packages used in clinical research and drug development.
https://phuse-org.github.io/valtools/

File space design choices #215

Open wlandau opened 2 years ago

wlandau commented 2 years ago

I enjoyed learning more about valtools in today's R/Pharma workshop. I have a better sense of the intent and design of the package. I like how it standardizes the final report, nudges the user to confront requirements and risk assessments, and explicitly links requirements to test cases.

However, I did feel friction when it came to setting up validation workspaces. valtools creates a unique, complicated, multi-level directory structure of files, many of which need to be edited by hand. Because of this and #214, I found myself constantly attending to a mental model of the file space, which competed in my brain for the attention I was trying to pay to the iterative process of writing requirements and rendering the final report.

What considerations led to the current design choices? Besides the current helpers for creating files, are there other ways to increase encapsulation and help users like me trust the current abstraction?

IMHO, R Markdown really excels when it simplifies the file system. fusen and R Markdown driven development are the best examples I know. And I created Target Markdown so users of targets could just write code chunks instead of remembering paths to script files.

thebioengineer commented 2 years ago

A big part of the design came from the natural process we followed when we were doing the validation framework by hand (pre-valtools). When we were writing up the white paper and writing valtools, we thought about how to arrange things so that all the validation content could be organized within a single folder.

Then all this content could be socked away within the vignettes folder, and the validation report Rmd could live there too, so it would get rendered when the package got built. An added benefit is that when the package is built, we can copy the validation folder into inst, so the validation content remains available once the package is installed.

Admittedly, we made a number of assumptions, but we tried to give informative errors/nudges when something invalidated them. #214 is a weird case: having multiple nested projects, which is usually advised against.

Maybe we could have a meeting and go over how we could simplify some of these things.

wlandau commented 2 years ago

Thanks for explaining, Ellis. Makes sense that you all organically found what works and then codified it. Happy to chat further.

thebioengineer commented 2 years ago

For sure. Apparently we need to better explain the process and formatting intent. Obviously it made sense to us, but if it doesn't translate, it's not helpful.

Also, I wonder if #214 made this more confusing.

wlandau commented 2 years ago

Yeah, could be a combination of #214 and the fact that I am a new user.

Is the white paper public? I have not been able to find it. Apologies if I am missing something obvious.

I just saw the cheat sheet, and I found it extremely helpful for this issue, particularly the "validation elements" file tree on page 1 and the top-to-bottom flow diagrams on page 2. I think featuring those specific visual aids more prominently in the other documentation would go a long way. Maybe that and #214 are all that is needed to solve this issue.

Beyond that, I will just plant a seed. (This is just my impression, I understand if it is beyond scope.)

valtools invents a lot of its own structure, and you all have clearly thought deeply about what that structure should be. The output files are extremely well organized and modular, and with practice, it is not a difficult system to become familiar with.

On the other hand, the more complicated and unique valtools output is, the greater the learning curve. By borrowing from existing conventions around packages and/or R Markdown, maybe the system could be simpler to learn without sacrificing payoff. A couple thoughts:

jwildfire commented 2 years ago

  • Test code in tests/testthat/, or maybe somewhere else in tests/. This location seems idiomatic for R packages, and unless test cases are computationally expensive (maybe that's why you separated them out?) I think they would make great additions to the package's ordinary test suite.

I found this discussion helpful, and just wanted to flag this comment in particular. My team is starting to use valtools now and I find the framework to be very helpful, but I am already considering whether parts of the workflow could be standardized to behave more like the standard package development workflow. Having tests live a few folders down in the vignettes folder seems especially non-standard.