BIMSBbioinfo / pigx_rnaseq

Bulk RNA-seq Data Processing, Quality Control, and Downstream Analysis Pipeline
GNU General Public License v3.0

Local build of pipeline fails to run on cluster #141

Open alexg9010 opened 2 months ago

alexg9010 commented 2 months ago

On the cluster, enter the Guix environment and build the pipeline:

guix environment -l guix.scm
./bootstrap.sh ; ./configure

Summary of the output:

$ guix environment -l guix.scm
$ ./bootstrap.sh ; ./configure
autoreconf: export WARNINGS=
autoreconf: Entering directory '.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal --force -I m4
aclocal: warning: couldn't open directory 'm4': No such file or directory
autoreconf: configure.ac: tracing
autoreconf: configure.ac: not using Libtool
autoreconf: configure.ac: not using Intltool
autoreconf: configure.ac: not using Gtkdoc
autoreconf: running: /gnu/store/635125h46k79sjzlxy4axnkvf4q3fhfd-autoconf-2.71/bin/autoconf --force
autoreconf: configure.ac: not using Autoheader
autoreconf: running: automake --add-missing --copy --force-missing
autoreconf: Leaving directory '.'
checking for a BSD-compatible install... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/install -c
checking whether build environment is sane... yes
checking for a race-free mkdir -p... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking how to create a pax tar archive... gnutar
checking whether make supports nested variables... (cached) yes
checking for a sed that does not truncate output... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/sed
checking for a Python interpreter with version >= 3.5... python
checking for python... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/python
checking for python version... 3.10
checking for python platform... linux
checking for GNU default python prefix... ${prefix}
checking for GNU default python exec_prefix... ${exec_prefix}
checking for python script directory (pythondir)... ${PYTHON_PREFIX}/lib/python3.10/site-packages
checking for python extension module directory (pyexecdir)... ${PYTHON_EXEC_PREFIX}/lib/python3.10/site-packages
checking python module: yaml... yes
checking for gunzip... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/gunzip
configure: Using /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/sed as sed executable.
checking for bash... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/bash
checking for snakemake... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/snakemake
checking for pandoc... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/pandoc
checking for STAR... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/STAR
checking for hisat2... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/hisat2
checking for hisat2-build... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/hisat2-build
checking for multiqc... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/multiqc
checking for fastp... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/fastp
checking for salmon... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/salmon
checking for R... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/R
checking for Rscript... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/Rscript
checking for bamCoverage... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/bamCoverage
checking for megadepth... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/megadepth
checking R package rmarkdown ... yes
checking R package knitr ... yes
checking R package ggplot2 ... yes
checking R package ggrepel ... yes
checking R package DESeq2 ... yes
checking R package DT ... yes
checking R package pheatmap ... yes
checking R package corrplot ... yes
checking R package reshape2 ... yes
checking R package plotly ... yes
checking R package scales ... yes
checking R package crosstalk ... yes
checking R package gprofiler2 ... yes
checking R package ggpubr ... yes
checking R package rtracklayer ... yes
checking R package SummarizedExperiment ... yes
checking R package tximport ... yes
checking R package rjson ... yes
checking for samtools... /gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/samtools
configure: Environment variables will be captured.
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating META
config.status: creating etc/settings.yaml
config.status: creating scripts/deseqReport.Rmd
config.status: creating Makefile
config.status: creating qsub-template.sh
config.status: creating test.sh
config.status: creating tests/test_hisat2/test.sh
config.status: creating tests/test_salmon/test_salmon_index.sh
config.status: creating tests/test_salmon/test_salmon_quant.sh
config.status: creating tests/test_salmon_counts/test.sh
config.status: creating tests/test_hisat2_counts/test.sh
config.status: creating tests/test_multiqc/test.sh
config.status: creating tests/test_deseq_reports/test.sh
config.status: creating tests/test_genome_coverage/test.sh
config.status: creating pigx-rnaseq

Start the pipeline:

export PIGX_UNINSTALLED="1" ; ./pigx-rnaseq -s tests/settings.yaml tests/sample_sheet.csv

This causes an error when the pipeline starts on the cluster:

$ cat /fast/home/a/agosdsc/projects/pigx/pigx_rnaseq/tests/output/snakejob.check_annotation_files.1.sh.e7042208
/gnu/store/rw6n86c008xqdbjs3nk4i7ggf6srdpgs-python-wrapper-3.10.7/bin/python: No module named snakemake

 $ grep PYTHONPATH /fast/home/a/agosdsc/projects/pigx/pigx_rnaseq/tests/output/snakejob.check_annotation_files.1.sh.o7042208
PYTHONPATH=
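
The same check can be run over all job logs at once. This is a minimal sketch, assuming the `.o*` files contain the `env` dump written by the job template; the helper name `empty_pythonpath_logs` is hypothetical and not part of the pipeline.

```shell
# Hypothetical helper: list job stdout logs whose captured environment
# contains an empty PYTHONPATH line (exactly "PYTHONPATH=").
empty_pythonpath_logs() {
    # $1: directory holding the snakejob.*.sh.o* files
    # -l prints only matching file names, -x requires a whole-line match
    grep -lx 'PYTHONPATH=' "$1"/snakejob.*.sh.o* 2>/dev/null
}
```

For example, `empty_pythonpath_logs tests/output` would print every log in which the job ran with an empty PYTHONPATH, and return a non-zero status if there is none.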

Check qsub-template.sh:

$ cat qsub-template.sh
#!/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/bin/bash
# properties = {properties}

if [ 'yes' = 'yes' ]; then
    export R_LIBS_SITE="/gnu/store/b0skxv953fpsdg79cs4g9qz78ds6pvlz-profile/site-library/:/home/agosdsc/.guix-profile/site-library/:/home/agosdsc/.guix-profile/site-library/"
    export PYTHONPATH=""
fi

env

{exec_job}

A fix would be to use the PYTHONPATH provided by Guix:

export PYTHONPATH=$GUIX_PYTHONPATH
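
A guarded variant of that export might look like the sketch below. It assumes that `./configure` captures PYTHONPATH at configure time (as the "Environment variables will be captured" message and the empty value in qsub-template.sh suggest) and that GUIX_PYTHONPATH is set inside the Guix environment. The function name `use_guix_pythonpath` is hypothetical.

```shell
# Hypothetical helper: seed PYTHONPATH from GUIX_PYTHONPATH, but only when
# PYTHONPATH is empty or unset, so an existing value is never overwritten.
use_guix_pythonpath() {
    if [ -z "${PYTHONPATH:-}" ] && [ -n "${GUIX_PYTHONPATH:-}" ]; then
        export PYTHONPATH="$GUIX_PYTHONPATH"
    fi
}
```

It would be called once inside the Guix environment, before `./bootstrap.sh ; ./configure`, so that the generated qsub-template.sh picks up a non-empty value.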

rekado commented 2 months ago

That's exactly right. That's what we're doing in the Guix package:

https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/bioinformatics.scm#n15615

alexg9010 commented 2 months ago

What is the recommended fix, then? Should we treat this as an edge case of Guix-based development and recommend manually running export PYTHONPATH=$GUIX_PYTHONPATH before the ./configure step?