Problem with dependencies and interactive data selection

thibautjombart commented 2 years ago

Actions running the analyses seem to encounter problems causing the jobs to fail. Digging it down, it seems to have been introduced by a new report ELR_evaluation and specifically this line: https://github.com/whocov/trend_analysis_public/blob/main/report_sources/ELR_evaluation.Rmd#L45

Also note that for some reason the html version of the report has been committed and pushed to github.

Cause of the issue

reportfactory::list_deps() currently runs the Rmd to isolate dependencies actually needed, as opposed to deps which may for instance be mentioned in comments or in the text. The issue is that file.choose cannot be used, by definition, for automation, and so the Rmd does not compile and this causes errors further down the line.

Proposed way forward

on the reportfactory side, we will do things more simply, so that a failing Rmd will not prevent the detection of deps; more on this when Tim T will have made progress
I would strongly recommend avoiding interactivity in anything that is meant for automation, as both are incompatible; if needed, one could always keep the interactive file input as an option triggered by a parameter e.g.

if (params$interactive) {
# use file.choose() ...
} else {
# use default input
}

in which case that parameter can default to FALSE in the yaml.

Important note on reportfactory

The package's github is moving to reconverse, so to install the devel version you'll need to use reconverse/reportfactory and not reconhub/reportfactory.

I would recommend using the CRAN version as soon as the next release has gone through - target date for submission is some time next week.

nsbatra commented 2 years ago

@thibautjombart Thanks for investigating this and sorry for the headache. The intent with the ELR_evaluation.Rmd was that this not be run automatically. The file.choose is so that the dataset does not end up on Github. I didn't realize that just its presence would mess up automation of other scripts.

I will try to implement the parameter solution you suggested. Thanks.

Regarding the HTML - I had tried to gitignore the html output, but for some reason that did not work - can you suggest what is wrong?

thibautjombart commented 2 years ago

Nope it looks fine to me. My guess would be it has been 'git add --force'ed by mistake?

thibautjombart commented 2 years ago

Hehe in fairness, I would have not suspected it would mess other part of the infrastructure either ;)

thibautjombart commented 2 years ago

Update

The reportfactory update is now getting on CRAN thanks to Tim's fix: https://cran.r-project.org/web/packages/reportfactory/index.html

This will fix the issue with list_deps(). Note that the former behaviour is still available using: list_deps(parse_first = TRUE)

With this option, the function still errors due to the syntax error in pin_plot: https://github.com/whocov/trend_analysis_public/blob/main/report_sources/ELR_evaluation.Rmd#L210

In terms of design, there had been an implicit assumption so far in the reportfactory that all Rmd in report sources were functional (i.e. compile successfully), using other branches in a VCS for devel versions. But this should no longer be a requirement from version 0.4.0 onwards.

nsbatra commented 2 years ago

Thanks @thibautjombart and @TimTaylor for your quick actions!

Flagging @henryls1 for awareness. Today we were able to compile the necessary reports/assemble data files locally.

I think this design shift is a good one, broadly speaking. Although having branches in a VCS may be best-practice, many people who would benefit from reportfactory will not be comfortable using a VCS and will have Rmds that are not functional. I think this move will make the tool more appealing to a broader audience.

TimTaylor commented 2 years ago

Cool - yep, my initial implementation was a little too picky on dependencies ;) . For info the new version of report factory is 0.4.0. Source packages are now on CRAN but the binaries will take a few days to arrive.

thibautjombart commented 2 years ago

I have made a PR to close this: https://github.com/whocov/trend_analysis_public/pull/39

Should fix the automation. Tagging @nsbatra @henryls1 to process / merge the PR.

thibautjombart commented 2 years ago

So the reportfactory issue is fixed now but there are other issues. It seems the package whomapper is used in the new report but is not on CRAN. If this is a package available from a remote, it needs to be installed prior to running install_deps() in the actions.

nsbatra commented 2 years ago

@thibautjombart ... I can just remove this RMD from the repo and store it elsewhere. Unless (let me know) you find it useful to keep troubleshooting these issues to improve reportfactory.

Maybe another solution would be to add an argument to install_deps() to ignore certain RMD?

thibautjombart commented 2 years ago

Yes I think it is a good idea, if only to double check that there is nothing else going on. We can always re-add it later if it needs automation. As a general rule I would think only things that need automation at the same pace as the rest of the infrastructure should be added.

nsbatra commented 2 years ago

I've removed the ELR evaluation report from the report_sources folder, but jobs still are failing. I'll try one more time, but perhaps there is something else going on.

thibautjombart commented 2 years ago

Thanks. There are a few spurious files in report_sources including a dataset which I'll remove but I can't see this being an issue. The output of list_deps() seems fine to me, so I am not sure where this comes from. Got to run some errands all day today (and pretty much the rest of the week) but I'll try and take a look asap. Has anyone made big changes since last time the infrastructure was working? It would make sense to try and revert back to that state to see what happens.

henryls1 commented 2 years ago

Hi both - some colleagues had a similar issue with GITHUB_PATs so i reached out for their solution. As a test I implemented it in auto_synthesis.yml and tested it - seems to be working well. See the following action and the commit that fixed the issue.

I'm going to go ahead and make the same change to the other .yml files

TimTaylor commented 2 years ago

Awesome - great spot!. Yes - I think something has changed with how authentication has been working on GitHub so you now need an explicit token specified it seems. FI - I think you could actually use GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} as, from my limited understanding, the PHIFUNC deals with the private repos. @thibautjombart can probably confirm.

thibautjombart commented 2 years ago

Awesome Henry! The PAT was indeed the last thing I was thinking of trying before resorting to dark rituals involving chickens at a moonless night. So, glad it worked! :)

I am not sure why the github action is no longer using the dedicated token by default (no point in having a default token if it's not used by default!), but it is probably safer to go with your solution and use mine. Personalised token from users with admin rights have additional priviledges like the ability to push to other branches etc.

For the record: we basically had 2 independent problems, one of which was out of our hands, happening at the same time. That's quite unfortunate, but looking ahead, to preserve automation, maybe a good course of action would be:

check that new PR / commits with substantial changes don't break anything by manually triggering a job (e.g. auto-update-all)
if broken, revert to previous commits, check that actions are working again; if they're not, that's probably something to do with the way github actions work; fix that first and then try the new commits again

There is probably a way to automate this with other actions, as this is basically continuous integration. Such tests should ideally be triggered at the PR stage. I can look into that kind of improvement when the dust settles on my end.

whocov / trend_analysis_public