Open alexg9010 opened 2 months ago
yikes. We use pandoc 2. Perhaps something broke in the rmarkdown check for pandoc? I'll take a look.
I just did this and it works fine:
guix shell --container r-minimal r-rmarkdown -- R -e 'rmarkdown::pandoc_available("2.11")'
So, that's not it.
The reason is likely that you're using PiGx from a checkout. I would assume that on the cluster nodes you don't actually have Pandoc. What does the tools section of the settings file look like? Using PIGX_UNINSTALLED is also a red flag.
This is the tools section from the generated `config.json`; the test settings file does not contain any tool specification:
"tools": {
"Rscript": {
"args": "--vanilla",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/Rscript"
},
"bamCoverage": {
"args": "--normalizeUsing BPM --numberOfProcessors 2",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/bamCoverage"
},
"fastp": {
"args": "--adapter_sequence=AGATCGGAAGAGCACACGTCTGAACTCCAGTCA --adapter_sequence_r2=AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/fastp"
},
"gunzip": {
"args": "",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/gunzip"
},
"hisat2": {
"args": "--fast",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/hisat2"
},
"hisat2-build": {
"args": "",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/hisat2-build"
},
"megadepth": {
"args": "",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/megadepth"
},
"multiqc": {
"args": "",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/multiqc"
},
"salmon_index": {
"args": "index",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/salmon"
},
"salmon_quant": {
"args": "quant",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/salmon"
},
"samtools": {
"args": "",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/samtools"
},
"sed": {
"args": "",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/sed"
},
"star_index": {
"args": "",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/STAR"
},
"star_map": {
"args": "",
"executable": "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/STAR"
}
}
}
It seems the pandoc path is resolved via rmarkdown::find_pandoc() (see https://bookdown.org/yihui/rmarkdown-cookbook/install-pandoc.html).
The documentation describes find_pandoc() as follows: "Searches for the pandoc executable in a few places and use the highest version found, unless a specific version is requested." Source: https://pkgs.rstudio.com/rmarkdown/reference/find_pandoc.html
Specifically, it searches the paths given by "RSTUDIO_PANDOC", "PATH" (via rmarkdown:::find_program()), and the folder "~/opt/pandoc".
There is no "~/opt/pandoc", but exporting "RSTUDIO_PANDOC" via qsub is possible by updating the qsub template qsub-template.sh.in:
#!@GNUBASH@
# properties = {properties}
if [ 'yes' = '@capture_environment@' ]; then
export R_LIBS_SITE="@R_LIBS_SITE@"
export PYTHONPATH="@PYTHONPATH@"
export RSTUDIO_PANDOC="@PANDOC@"
fi
env
{exec_job}
Checking which pandoc version is used by adding this chunk to rule report1:
{RSCRIPT_EXEC} -e 'rmarkdown::find_pandoc()'
We can inspect the job's environment by checking the job log output:
$ less tests/output/snakejob.report1.40.sh.o7043287
[...]
RSTUDIO_PANDOC=/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/pandoc
[...]
$version
[1] ‘0’
$dir
NULL
So it seems no matching dir was found.
Running find_pandoc() inside guix environment -l guix.scm in the pigx folder works:
> rmarkdown::find_pandoc()
sh: warning: setlocale: LC_ALL: cannot change locale (en_US.utf-8)
$version
[1] '2.19.2'
$dir
[1] "/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin"
My reading of pandoc.R tells me that RSTUDIO_PANDOC is meant to be a directory. Give it the dirname of @PANDOC@ instead.
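A minimal sketch of that suggestion, not the actual pigx-common change: the variable pandoc_bin is a hypothetical stand-in for whatever @PANDOC@ expands to at configure time (here, the store path from the job log above). It only shows how the directory can be derived from the executable path before exporting it.

```shell
# Hypothetical stand-in for the value substituted for @PANDOC@ at configure time.
pandoc_bin="/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/pandoc"

# RSTUDIO_PANDOC should name the directory that contains pandoc,
# so strip the trailing "/pandoc" from the executable path.
RSTUDIO_PANDOC="$(dirname "$pandoc_bin")"
export RSTUDIO_PANDOC

echo "RSTUDIO_PANDOC=$RSTUDIO_PANDOC"
```

In the qsub template the same idea would presumably replace the plain export of "@PANDOC@"; whether the dirname is computed at job time or at configure time is a design choice left open here.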
Thanks, using dirname of pandoc works.
I will try to fix this in pigx-common.
I was running the test data in a cluster environment.
I had to extend the memory limit for counts_from_SALMON in tests/settings.yaml:
Then run via
The pipeline failed for the report generating jobs:
This is the content of the log:
I see this pandoc related error: