seth127 opened this issue 2 years ago
Work-in-progress requirements:

- `inst/validation` and load to spec object
- `mrgvalprep::get_sys_info()`, passing `COMMIT_HASH` and (if present) `METWORX_VERSION`
- `auto_test_dir` with csv and json for previously run tests
- `style_dir`, but default to `NULL`
- `"{name}-{version}-validation-docs"`
I started on implementing this and I'm looking for some feedback here @kyleam @barrettk. (Let's keep it on this issue instead of the PRs for now, saving PR comments for actual code review, if it comes to that.)
I implemented a helper extract_r_package_info()
that does the first five bullets from the previous comment, but doesn't run the tests or create the docs. There is some added complication to those steps, discussed below. I also went ahead and created validate_r_package()
in a separate branch, which does everything listed in the previous comment. I'm not sure if that's a good idea or not though.
extract_r_package_info()
Draft of this in #49.
I think this is pretty good as is, though I'm certainly open to feedback on both design and implementation. TL;DR you pass it a path to a local checkout of the package repo and it does the following and gives you the results as a named list:
- `DESCRIPTION` file
- `DESCRIPTION` file
- `inst/validation` and load to spec object
- `NEWS.md` matching name and version

It gets rid of a fair amount of annoying boilerplate code that would need to be added to a script like this. That said, when I tried to refactor that script to use this helper, it was still a bit long and annoying:
Which brings us to...
validate_r_package()
A draft of this is in #50. I have very mixed feelings about this function. On the bright side, it makes that entire script above turn into:
```r
mrgvalprep::validate_r_package(
  here::here(),
  style_dir = "~/Documents/docx-ref-header-image/"
)
```
So that's pretty slick. Here are several concerns though:
- Arguably it belongs in `mrgvalidate` instead of `mrgvalprep`: it calls `mrgvalidate::create_package_docs()` (which is only present in 2.0.0) and actually creates validation docs, which goes beyond the "prep" scope of `mrgvalprep`.
- `mrgvalidate` is already in Suggests for `mrgvalprep`, but only for use in vignettes. This is the first time it's called in an actual function.
- There was a `run_tests()` that did a bunch of gymnastics to control the test environment, but that was in large part because it was cloning a repo and installing it (along with dependencies) to a temp dir. This relies on `validate_r_package()` being called from a session with all of the relevant package's dependencies available (i.e. on `.libPaths()`), but I think that's a fine assumption/restriction. It will pretty much only be called from within the package project or in Drone, and either case should be fine.
- It takes `tests_dir` as an arg with a default value and also lets you pass through `auto_test_dir` with pre-run test results.
- Is passing `...` through to `mrgvalidate::create_package_docs()` the right move? Maybe, but we should probably at least check them.

Thanks in advance for the feedback here. Again, let's keep the discussion on this issue (instead of the PRs) until we decide on a path forward.
I was still hopeful that we could use the `r-dev-ci-mrgval` image (defined in our `r-snapshots/docker-r-dev-ci` GHE repo) for most packages (example scratch build with old mrgvalidate). Aside from of course needing an update for mrgvalidate 2.0, that image currently handles only packages whose tests run completely on the `r-dev-ci-mpn` images (so notably not bbr), but IIRC the idea was that, for these non-Drone cases (hopefully only a few), we could adjust it to take a specified `auto_test` directory so that the manual test runner could feed that as input to a `docker run ...` [1].
[edit: "manual test runner" is bad word choice in this context. I mean the human that's running the automatic tests.]
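For concreteness, the invocation a manual tester might use could look roughly like the following. This only sketches the shape of the command: the image name, mount point, and the idea that the image would pick up results from a mounted directory are all assumptions, not the actual `r-dev-ci-mrgval` interface.

```shell
# Construct (but don't run) a hypothetical docker invocation that feeds
# a directory of pre-run test results into the validation image.
# Image name and container path are made up for illustration.
AUTO_TEST_DIR="$PWD/auto_tests"
cmd="docker run --rm -v ${AUTO_TEST_DIR}:/opt/auto_tests:ro r-dev-ci-mrgval"
echo "$cmd"
```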
In that context, I think the value of a full end-to-end runner like `validate_r_package` goes down. The most valuable helper would be one for generating the test results directory, because that's what packages like bbr need to handle outside of the image. It's still worth considering whether some other helpers (things like `parse_latest_release`) should live in mrgvalprep (where they can be nicely documented and tested), even if the `r-dev-ci-mrgval` image ends up being their only user, but then these helpers should be designed with `r-dev-ci-mrgval` in mind.
So, I suppose my thoughts on this issue depend on whether you see `r-dev-ci-mrgval` as the primary way of generating the docs for most packages.
[1] To run these images on Metworx, we need to be able to pull from ECR. It doesn't look like that's accessible from workflows by default. So, that'd need to be looked into. (Accessing ECR from laptop while on VPN works fine.)
> for these non-Drone cases (hopefully only a few), we could adjust it to take a specified `auto_test` directory so that the manual test runner could feed that as input to a `docker run ...` [1].
A half-baked idea that gets around the need for the manual tester to call `docker run` at all: have a helper script that injects the untracked test results into a disconnected branch (similar to gh-pages) and pushes that. On push of that branch, Drone could be triggered to build the val docs and publish them like it does for the standard case. That approach would mean the test results (including the `tests.json`) would be public, but I think we'd be okay with that.
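The disconnected-branch mechanics could look something like this. Everything here is a sketch in a throwaway repo: the branch name, file names, and commit messages are made up, and a real helper would run inside the package repo and push at the end rather than stopping at the local ref.

```shell
set -eu
# Demo in a throwaway repo; a real helper would run in the package repo.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email demo@example.com
git config user.name demo
echo 'code' > pkg.R && git add pkg.R && git commit -qm "package work"
echo '{"tests": "results"}' > tests.json      # untracked test results

git checkout -q --orphan mrgval-results       # disconnected history
git rm -r -q --cached .                       # drop the inherited index
git add tests.json
git commit -qm "test results"
# A real helper would now: git push origin mrgval-results
git ls-tree --name-only mrgval-results        # the branch holds only the results
```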
@kyleam this is all a very good point. I do still think that running in Drone (on tag, similar to how we build and publish) is probably the best scenario. And, in that case, I think you're right that the only actually useful thing here, even in `extract_r_package_info()`, is `parse_latest_release()`.
> A half-baked idea... have a helper script that injects the untracked test results into a disconnected branch (similar to gh-pages) and pushes that.
I like this idea. I think we found that there are actually a non-trivial number of cases where we need to run (some subset of) the tests on Metworx. I remember a few of the plotting packages... probably mrgsolve
...
Let's remember to talk about this at some point soon: basically, what would be the interface for a user running the tests locally (on Metworx) and then pushing them? Some questions:

- Would the tests need to be run on the same commit hash that is being tagged/validated? That seems obnoxious and brittle, but if not... how do we know that we tested the same thing that we're validating?
- Could this be an `mrgvalprep` function? Would that be better or worse than a boilerplate `.sh` script that gets checked in to various package repos?
@seth127:

> basically what would be the interface for a user running the tests locally (on Metworx) and then pushing them. Some questions:

Good questions. Taking the second one first:
> Could this be an `mrgvalprep` function? Would that be better or worse than a boilerplate `.sh` script that gets checked in to various package repos?
Hmm, I hadn't thought much yet about the actual location of this logic. (I think I said "helper script" just because I had `subdir-to-gh-pages` in my head as an example of the "import to subtree on other ref" functionality.) I'd say `mrgvalprep` makes sense, because the test runner would already be using it for running and writing the test results. We could extract the logic you have in `validate_r_package` into a "run and write tests" function, and then have a wrapper on top of that that writes the results to a Git reference and pushes (more on that next).
> Would the tests need to be run on the same commit hash that is being tagged/validated? That seems obnoxious and brittle, but if not... how do we know that we tested the same thing that we're validating?
Yes, I think there ought to be an automatic link between the tester's commit (or tree, really) and the results that Drone uses. Here's a sketch:
1. The human is ready to tag a commit for a release and runs `mrgvalprep::run_tests_and_push_results()` (needs a better name).
   - This function should be strict and abort if there are any untracked changes.
2. Assuming the tests pass, that injects the results and info into a commit on the `refs/mrgval/test-results` reference. (Note that this is outside the standard branch namespace; I can expand on why I think that is appealing in this scenario.)
   - With that reference, the results are put into a subdirectory named by the tree ID (likely actually a pair of subdirectories: `{treeid[0:2]}/{treeid[2:]}`).
   - Note that using the tree ID focuses on what we actually care about. If the human tests the tip of the release branch but then ends up tagging a merge commit that has exactly the same tree (i.e. it was a fast-forward merge), that's fine.
3. The function handles pushing that ref. After incorporating any remote changes, the only possible conflict is from that exact tree already having results, so that should be propagated up to the caller.
4. The human creates and pushes the release (or dev release) tag.
5. The repo's Drone build has a tag-triggered pipeline:
   - One step derives the tree ID (by dereferencing the tag/commit), fetches `refs/mrgval/test-results`, and extracts the test results for that tree ID. If the results aren't present but the tag is a dev release, there could be some logic to exit the pipeline without failure. (I'm presenting this as a separate step conceptually, but this logic should probably live within `r-dev-ci-mrgval`, not necessarily as part of the main script, so that different repos don't need to duplicate non-trivial logic in their `.drone.yml`s.)
   - The next step runs `r-dev-ci-mrgval`, feeding it the test results extracted in the previous step.
   - The final step publishes the docs.
Note that, when comparing this to the Drone pipeline for the fully automatic val doc generation, only the first step differs.
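To make the tree-ID plumbing concrete, here is one way the push and fetch sides could be wired up with git plumbing commands. This is a sketch against a throwaway repo: the ref name and `{treeid[0:2]}/{treeid[2:]}` layout follow the steps above, while the file names and the scratch index are implementation details I'm assuming.

```shell
set -eu
# Throwaway repo standing in for the package repo at release time.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email demo@example.com
git config user.name demo
echo 'pkg' > pkg.R && git add pkg.R && git commit -qm "release candidate"

# Push side: key the results by the tree ID of the commit being tagged.
tree=$(git rev-parse 'HEAD^{tree}')
dir="$(printf %s "$tree" | cut -c1-2)/$(printf %s "$tree" | cut -c3-)"

# Store the (untracked) test results under that subdirectory on
# refs/mrgval/test-results, using a scratch index so HEAD is untouched.
mkdir -p "results/$dir"
echo '{"pass": true}' > "results/$dir/tests.json"
blob=$(git hash-object -w "results/$dir/tests.json")
export GIT_INDEX_FILE=.git/mrgval-index
git update-index --add --cacheinfo 100644 "$blob" "$dir/tests.json"
results_tree=$(git write-tree)
unset GIT_INDEX_FILE
commit=$(git commit-tree "$results_tree" -m "test results for tree $tree")
git update-ref refs/mrgval/test-results "$commit"
# A real helper would now: git push origin refs/mrgval/test-results

# Drone side: derive the tree ID from the tag's commit and pull the
# matching results out of the ref (here HEAD stands in for the tag).
tagtree=$(git rev-parse 'HEAD^{tree}')
tagdir="$(printf %s "$tagtree" | cut -c1-2)/$(printf %s "$tagtree" | cut -c3-)"
git cat-file -p "refs/mrgval/test-results:$tagdir/tests.json"
```

Because the key is the tree ID rather than the commit hash, a fast-forward merge commit with the same tree resolves to the same results directory, matching the note in the sketch.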
A draft implementation of the git ref idea is at gh-52.
Given new changes in mrgvalidate 2.0.0, it will be nice to have a helper function that you can point at an R package and it will extract all of the relevant information from the `DESCRIPTION`, `NEWS.md`, `tests/testthat/` dir, etc. and feed them to `mrgvalidate::create_package_docs()`.

Considerations
This issue needs more detail before we start on it, but here are some thoughts off the top:
- Can this all be handled by the `mrgvalidate::create_package_docs()` call? I think so, but we may need to optionally be able to pass through a few things (e.g. see next bullet).