ohsu-cedar-comp-hub / WGS-nextflow-workflow

Apache License 2.0

fastqc image discrepancies #33

Closed: rlancaster96 closed this issue 5 months ago

rlancaster96 commented 5 months ago

The fastqc image that Ruben built and pushed to Quay fails to load its environment properly:

Command error:
  .command.sh: line 2: fastqc: command not found

rlancaster96 commented 5 months ago

Current Behavior

Command run:

nextflow run fastqc.nf \
    -params-file tumor_params.json \
    -c nextflow.config \
    -with-singularity fastqc.sif

Nextflow errors out during the first process with a command error:

Command error:
  .command.sh: line 2: fastqc: command not found

I also cannot get fastqc to run in Singularity outside of Nextflow:

singularity exec fastqc.sif fastqc --help
FATAL:   "fastqc": executable file not found in $PATH

Contents of the Docker build file:

FROM ubuntu:bionic-20180426

RUN apt-get update \
    && apt-get install -y \
       libfontconfig1 \
       openjdk-11-jre-headless \
       perl-modules \
       unzip \
       wget \
    && apt-get clean \
    && wget https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.7.zip \
    && unzip fastqc_v0.11.7.zip \
    && rm *.zip \
    && mv FastQC /usr/local/ \
    && chmod 755 /usr/local/FastQC/fastqc \
    && sed -i 's/kmer[[:space:]]\+ignore[[:space:]]\+1/kmer ignore 0/' /usr/local/FastQC/Configuration/limits.txt \
    && sed -i 's/assistive_technologies/#assistive_technologies/' /etc/java-11-openjdk/accessibility.properties \
    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

I built the Docker image locally, committed and pushed it to Quay, then pulled it from Quay on Exacloud to make a Singularity image. I have done this successfully for some other images in the workflow.
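
One thing the build file above does not do is put /usr/local/FastQC on the image's PATH, which would be consistent with the "command not found" error. A possible fix, not something tried in this thread, is to add a symlink step to the RUN chain (for example `ln -s /usr/local/FastQC/fastqc /usr/local/bin/fastqc`) or an `ENV PATH=...` line. The sketch below only rehearses the symlink idea in a throwaway directory. The stub script, the sandbox layout, and the variable names are made up for illustration, and nothing here runs Docker or Singularity.

```shell
# Sandbox standing in for the image's filesystem (hypothetical layout).
root=$(mktemp -d)
mkdir -p "$root/usr/local/FastQC" "$root/usr/local/bin"

# Stub launcher, installed the way the build file installs the real one.
printf '#!/bin/sh\necho "FastQC stub"\n' > "$root/usr/local/FastQC/fastqc"
chmod 755 "$root/usr/local/FastQC/fastqc"

# A PATH that contains usr/local/bin but not usr/local/FastQC.
image_path="$root/usr/local/bin"

# Before the fix: the bare name does not resolve.
before=$(PATH="$image_path" /bin/sh -c 'command -v fastqc' || echo "not found")

# Assumed extra build step: link the launcher into a directory already on PATH.
ln -s "$root/usr/local/FastQC/fastqc" "$root/usr/local/bin/fastqc"

# After the fix: the bare name resolves and runs.
after=$(PATH="$image_path" /bin/sh -c 'fastqc')

echo "before: $before"
echo "after:  $after"
rm -rf "$root"
```

If the real image behaves the same way, the bare `fastqc` call in fastqc.nf should resolve without any change to the workflow file.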

Environment

Nextflow: version 23.10.1 build 5891
Singularity: version 3.8.0-1.el7

lbeckman314 commented 5 months ago

Hmm... Looks like the fastqc executable should be at /usr/local/FastQC/fastqc. I know @kellrott mentioned that Singularity doesn't take the $PATH into consideration when launching programs, so I wonder if we'll see any difference by changing the singularity command to:

# Full Path of fastqc
singularity exec fastqc.sif /usr/local/FastQC/fastqc --help

If this command works then it might be worth adding fastqc to your $PATH in order for the Nextflow command to find it:

# Update PATH
export PATH="$PATH:/usr/local/FastQC"

# Run Nextflow Command
nextflow run fastqc.nf -params-file tumor_params.json -c nextflow.config -with-singularity fastqc.sif

If you share the output of running the commands above we can narrow in on what may be the issue!

rlancaster96 commented 5 months ago

# Updated PATH
export PATH="$PATH:/usr/local/FastQC"

# Confirm PATH update
echo $PATH
# FastQC is now in my PATH
/opt/singularity/3.8.0/bin:/usr/lib64/qt-3.3/bin:/home/users/lancasru/perl5/bin:/home/users/lancasru/.vscode-server/cli/servers/Stable-e170252f762678dec6ca2cc69aba1570769a5d39/server/bin/remote-cli:/home/users/lancasru/.sdkman/candidates/java/current/bin:/home/groups/CEDAR/lancasru/anaconda3/envs/nextflow_only/bin:/home/groups/CEDAR/lancasru/anaconda3/condabin:/home/exacloud/software/spack/bin:/usr/local/bin:/usr/bin:/opt/puppetlabs/bin:/opt/dell/srvadmin/bin:/usr/local/sbin:/usr/sbin:/usr/local/FastQC:/usr/local/FastQC

Error persists when running nextflow:

# Run nextflow
(nextflow_only) [lancasru@exanode-11-27 config_sif]$ nextflow run /home/groups/CEDAR/lancasru/WGS_COH_NF/WGS-nextflow-workflow/workflows/qc/fastqc.nf -params-file /home/groups/CEDAR/lancasru/WGS_COH_NF/nextflow_test/references/params_files/tumor_params.json -c /home/groups/CEDAR/lancasru/WGS_COH_NF/config_sif/nextflow.config -with-singularity /home/groups/CEDAR/lancasru/WGS_COH_NF/config_sif/fastqc.sif
N E X T F L O W  ~  version 23.10.1
Launching `/home/groups/CEDAR/lancasru/WGS_COH_NF/WGS-nextflow-workflow/workflows/qc/fastqc.nf` [tiny_wescoff] DSL2 - revision: 6f8df88e57
executor >  local (1)
[01/a8b2b2] process > fastQC [100%] 1 of 1, failed: 1 ✘
ERROR ~ Error executing process > 'fastQC'

Caused by:
  Process `fastQC` terminated with an error exit status (127)

Command executed:

  fastqc -o /home/groups/CEDAR/lancasru/WGS_COH_NF/nextflow_test/output/tumornormal_output/fastqc tumor_chr20_R1.fq.gz  
  fastqc -o /home/groups/CEDAR/lancasru/WGS_COH_NF/nextflow_test/output/tumornormal_output/fastqc tumor_chr20_R2.fq.gz

Command exit status:
  127

Command output:
  (empty)

Command error:
  .command.sh: line 2: fastqc: command not found

Work dir:
  /home/groups/CEDAR/lancasru/WGS_COH_NF/config_sif/work/01/a8b2b272ae5a15f266393e80e578a4

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '.nextflow.log' file for details
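
A note on the exit status in the report above: by shell convention, 127 means the command name could not be found, while 126 means a file was found but could not be executed. Since the build file already runs chmod 755 on the launcher, a 127 points toward PATH lookup rather than permissions. The sketch below reproduces both statuses with made-up names (`no_such_tool_for_demo` and the temporary `tool` file are placeholders, not part of the workflow).

```shell
work=$(mktemp -d)

# Case 1: the name is not on PATH, so the shell reports 127.
missing_status=0
no_such_tool_for_demo --help 2>/dev/null || missing_status=$?

# Case 2: the file exists but has no execute bit, so the shell reports 126.
printf '#!/bin/sh\necho hi\n' > "$work/tool"
chmod 644 "$work/tool"
noexec_status=0
"$work/tool" --help 2>/dev/null || noexec_status=$?

echo "missing: $missing_status, not executable: $noexec_status"
rm -rf "$work"
```

Seeing 126 from the real container instead would suggest looking at file permissions rather than PATH.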

lbeckman314 commented 5 months ago

Looks like these are the calls to fastqc that are failing in fastqc.nf:

#!/usr/bin/env nextflow

// Define the process for running FastQC
process fastQC {
    publishDir "${params.outdir}/fastqc", mode: 'copy'

    input:
    path read1
    path read2
    val id

    script:
    """
     fastqc -o ${params.outdir}/fastqc ${read1}  # <--- fastqc not found?
     fastqc -o ${params.outdir}/fastqc ${read2} 
    """
}

rlancaster96 commented 5 months ago

I think I don't have my nextflow project set up correctly... running nextflow list returns (none). I'm looking into this now.

Also, I changed the .nf script to state the absolute path for the fastqc command and it worked.

    script:

    """
     /usr/local/FastQC/fastqc -o ${params.outdir}/fastqc ${read1}  
     /usr/local/FastQC/fastqc -o ${params.outdir}/fastqc ${read2}
    """

lbeckman314 commented 5 months ago

Nice! And I was incorrect about the nextflow list command: it shouldn't be necessary, and in fact it doesn't return any helpful $PATH info.

Great work adding the absolute path for the fastqc command, wonderful to see that's working!

If for any reason you can't update Nextflow files to use the absolute path (or you encounter this issue again) you can also update the $PATH for Nextflow workflows by adding the following env section to a Nextflow config file:

env {
    PATH = "$PATH:/usr/local/FastQC/"
}
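
One plausible reading of the earlier result, where exporting PATH on the host changed nothing, is that the task inside the container starts with the image's own PATH rather than the host's, so the change has to be made in the task's environment (which is what the env section is for). The sketch below is only an analogy for that idea: `env -i` stands in for the container boundary, the stub script and sandbox directory are made up, and it does not run Nextflow or Singularity.

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/FastQC"
printf '#!/bin/sh\necho "FastQC stub"\n' > "$sandbox/FastQC/fastqc"
chmod 755 "$sandbox/FastQC/fastqc"

# Host-side PATH with the FastQC directory appended, as tried earlier in the thread.
host_path="$PATH:$sandbox/FastQC"

# Stand-in for the container: the child gets a fixed PATH of its own,
# so the host-side addition never reaches it.
from_host=$(PATH="$host_path" env -i PATH=/usr/bin:/bin /bin/sh -c 'command -v fastqc || echo "not found"')

# Stand-in for the env section: PATH is extended inside the child's environment.
from_task=$(env -i PATH="/usr/bin:/bin:$sandbox/FastQC" /bin/sh -c 'fastqc')

echo "host export only: $from_host"
echo "task environment: $from_task"
rm -rf "$sandbox"
```

Whether the real container behaves this way depends on the Singularity version and its environment settings, so this is worth confirming against the image itself.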