SystemsGenetics / GEMmaker

A workflow for construction of Gene Expression count Matrices (GEMs). Useful for Differential Gene Expression (DGE) analysis and Gene Co-Expression Network (GCN) construction
https://gemmaker.readthedocs.io/en/latest/
MIT License

clean_work_dirs bug #263

Closed cbmckni closed 2 months ago

cbmckni commented 2 years ago

### Description of the bug

I am running GEMmaker on the PRP with a human kidney dataset stored locally in split fastq format. For now I am using only the CORG reference, as a test.

Command used:

nextflow -C nextflow.config kuberun systemsgenetics/gemmaker -profile k8s -v pvc-storage-test --pipeline kallisto --kallisto_index_path /workspace/projects/systemsgenetics/gemmaker/assets/demo/references/CORG.transcripts.Kallisto.indexed --input /workspace/data/kidneydata/sra/*.fastq --outdir /workspace/gemmaker/output --max_cpus 8

nextflow.config to add k8s profile:

profiles {
    k8s {
        process {
            executor = "k8s"
        }
        executor {
            queueSize = 16
        }
    }
}

I get the following error, even after deleting the work dir:

[9f/c1136c] Submitted process > GEMmaker:clean_work_dirs (SRR5139395_2)
[3c/9e3485] Submitted process > GEMmaker:fastqc_1 (SRR5139414_2)
Error executing process > 'GEMmaker:clean_work_dirs (SRR5139395_2)'

Caused by:
  Process `GEMmaker:clean_work_dirs (SRR5139395_2)` terminated with an error exit status (1)

Command executed:

  for dir in /workspace/cole/work/24/1ec0308597286e9e24326f8d85c546/SRR5139395_2.Kallisto.ga; do
    if [ -e $dir ]; then
      echo "Cleaning: $dir"
      files=`find $dir -type  f `

      echo "Files to delete: $files"
      clean_work_files.sh "$files" "null"
    fi
  done

Command exit status:
  1

Command output:
  Cleaning: /workspace/cole/work/24/1ec0308597286e9e24326f8d85c546/SRR5139395_2.Kallisto.ga
  Files to delete: /workspace/cole/work/24/1ec0308597286e9e24326f8d85c546/SRR5139395_2.Kallisto.ga/run_info.json
  /workspace/cole/work/24/1ec0308597286e9e24326f8d85c546/SRR5139395_2.Kallisto.ga/abundance.h5
  /workspace/cole/work/24/1ec0308597286e9e24326f8d85c546/SRR5139395_2.Kallisto.ga/abundance.tsv
  cleaning 
  cleaning 
  cleaning 

Command error:
  ps: error while loading shared libraries: libprocps.so.4: cannot open shared object file: No such file or directory
  /workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 15: perl: command not found
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  /workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 26: $file: ambiguous redirect
  truncate: option requires an argument -- 's'
  Try 'truncate --help' for more information.
  touch: invalid date format ‘@’
  touch: invalid date format ‘@’
  /workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 15: perl: command not found
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  /workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 26: $file: ambiguous redirect
  truncate: option requires an argument -- 's'
  Try 'truncate --help' for more information.
  touch: invalid date format ‘@’
  touch: invalid date format ‘@’
  /workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 15: perl: command not found
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  /workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 26: $file: ambiguous redirect
  truncate: option requires an argument -- 's'
  Try 'truncate --help' for more information.
  touch: invalid date format ‘@’
  touch: invalid date format ‘@’

Work dir:
  /workspace/cole/work/9f/c1136c3f9d20f7a30ccc0c1c813c2a

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

WARN: Killing pending tasks (14)

Any idea what the issue may be? I used a slight variation of this command during the SciDAS workshop and had no issues.

### Command used and terminal output

[cole@localhost gm]$ ./nextflow -C nextflow.config kuberun systemsgenetics/gemmaker    -profile k8s    -v pvc-storage-test   --pipeline kallisto   --kallisto_index_path /workspace/projects/systemsgenetics/gemmaker/assets/demo/references/CORG.transcripts.Kallisto.indexed   --input  /workspace/data/kidneydata/sra/*.fastq   --outdir /workspace/gemmaker/output   --max_cpus 8
Pod started: jovial-spence
N E X T F L O W  ~  version 21.10.6
Launching `systemsgenetics/gemmaker` [jovial-spence] - revision: 9a880fd991 [master]

------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  systemsgenetics/gemmaker v2.1.0
------------------------------------------------------
Core Nextflow options
  revision                  : master
  runName                   : jovial-spence
  launchDir                 : /workspace/cole
  workDir                   : /workspace/cole/work
  projectDir                : /workspace/projects/systemsgenetics/gemmaker
  userName                  : root
  profile                   : standard
  configFiles               : /workspace/projects/systemsgenetics/gemmaker/nextflow.config, /workspace/cole/nextflow.config

Input/output options
  input                     : /workspace/data/kidneydata/sra/*.fastq
  sras                      : null
  outdir                    : /workspace/gemmaker/output
  failed_run_report_template: /workspace/projects/systemsgenetics/gemmaker/assets/failed_sra_runs.template.html
  multiqc_config_file       : /workspace/projects/systemsgenetics/gemmaker/assets/multiqc_config.yaml
  multiqc_custom_logo       : /workspace/projects/systemsgenetics/gemmaker/assets/systemsgenetics-gemmaker_logo.png

Kallisto Pipeline
  kallisto_index_path       : /workspace/projects/systemsgenetics/gemmaker/assets/demo/references/CORG.transcripts.Kallisto.indexed
  kallisto_bootstrap_samples: 0

Reference genome options
  igenomes_ignore           : true

Generic options
  publish_dir_mode          : link

Max job request options
  max_cpus                  : 8

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use systemsgenetics/gemmaker for your analysis please cite:

* The pipeline
  https://doi.org/10.5281/zenodo.3620945

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
  https://github.com/systemsgenetics/gemmaker/blob/master/CITATIONS.md
------------------------------------------------------
[99/1d5196] Submitted process > GEMmaker:fastqc_1 (SRR5139394_1)
[7b/1b263a] Submitted process > GEMmaker:fastqc_1 (SRR5139394_2)
[ac/2772e0] Submitted process > GEMmaker:fastqc_1 (SRR5139395_1)
[1d/aeb630] Submitted process > GEMmaker:fastqc_1 (SRR5139395_2)
[ea/1455e0] Submitted process > GEMmaker:fastqc_1 (SRR5139396_1)
[2c/b4ef50] Submitted process > GEMmaker:kallisto (SRR5139394_1)
[9b/d39dd1] Submitted process > GEMmaker:fastqc_1 (SRR5139396_2)
[f3/349dee] Submitted process > GEMmaker:kallisto (SRR5139394_2)
[09/eebb63] Submitted process > GEMmaker:kallisto (SRR5139395_1)
[94/0f2d08] Submitted process > GEMmaker:fastqc_1 (SRR5139397_1)
[24/1ec030] Submitted process > GEMmaker:kallisto (SRR5139395_2)
[c7/bd996a] Submitted process > GEMmaker:fastqc_1 (SRR5139397_2)
[b4/d19a17] Submitted process > GEMmaker:kallisto (SRR5139396_1)
[7e/719739] Submitted process > GEMmaker:kallisto (SRR5139396_2)
[3f/311d73] Submitted process > GEMmaker:kallisto (SRR5139397_1)
[02/a66b26] Submitted process > GEMmaker:kallisto (SRR5139397_2)
[75/f12ea4] Submitted process > GEMmaker:kallisto_tpm (SRR5139395_2)
[2b/5a9a55] Submitted process > GEMmaker:next_sample (SRR5139395_2)
[9f/c1136c] Submitted process > GEMmaker:clean_work_dirs (SRR5139395_2)
[3c/9e3485] Submitted process > GEMmaker:fastqc_1 (SRR5139414_2)
Error executing process > 'GEMmaker:clean_work_dirs (SRR5139395_2)'

Caused by:
  Process `GEMmaker:clean_work_dirs (SRR5139395_2)` terminated with an error exit status (1)

Command executed:

  for dir in /workspace/cole/work/24/1ec0308597286e9e24326f8d85c546/SRR5139395_2.Kallisto.ga; do
    if [ -e $dir ]; then
      echo "Cleaning: $dir"
      files=`find $dir -type  f `

      echo "Files to delete: $files"
      clean_work_files.sh "$files" "null"
    fi
  done

Command exit status:
  1

Command output:
  Cleaning: /workspace/cole/work/24/1ec0308597286e9e24326f8d85c546/SRR5139395_2.Kallisto.ga
  Files to delete: /workspace/cole/work/24/1ec0308597286e9e24326f8d85c546/SRR5139395_2.Kallisto.ga/run_info.json
  /workspace/cole/work/24/1ec0308597286e9e24326f8d85c546/SRR5139395_2.Kallisto.ga/abundance.h5
  /workspace/cole/work/24/1ec0308597286e9e24326f8d85c546/SRR5139395_2.Kallisto.ga/abundance.tsv
  cleaning 
  cleaning 
  cleaning 

Command error:
  ps: error while loading shared libraries: libprocps.so.4: cannot open shared object file: No such file or directory
  /workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 15: perl: command not found
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  /workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 26: $file: ambiguous redirect
  truncate: option requires an argument -- 's'
  Try 'truncate --help' for more information.
  touch: invalid date format ‘@’
  touch: invalid date format ‘@’
  /workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 15: perl: command not found
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  /workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 26: $file: ambiguous redirect
  truncate: option requires an argument -- 's'
  Try 'truncate --help' for more information.
  touch: invalid date format ‘@’
  touch: invalid date format ‘@’
  /workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 15: perl: command not found
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  stat: missing operand
  Try 'stat --help' for more information.
  /workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 26: $file: ambiguous redirect
  truncate: option requires an argument -- 's'
  Try 'truncate --help' for more information.
  touch: invalid date format ‘@’
  touch: invalid date format ‘@’

Work dir:
  /workspace/cole/work/9f/c1136c3f9d20f7a30ccc0c1c813c2a

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

WARN: Killing pending tasks (14)
[26/029fea] Submitted process > GEMmaker:kallisto (SRR5139414_2)
WARN: To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info.


### Relevant files

_No response_

### System information

nextflow version 21.10.6.5660

K8s executor

systemsgenetics/gemmaker v2.1.0

cbmckni commented 2 years ago

I get the same error running with the locally stored CORG fastq.gz files, so that should be easier to replicate:

./nextflow -C nextflow.config kuberun systemsgenetics/gemmaker    -profile k8s    -v pvc-storage-test   --pipeline kallisto   --kallisto_index_path /workspace/projects/systemsgenetics/gemmaker/assets/demo/references/CORG.transcripts.Kallisto.indexed   --input  /workspace/projects/systemsgenetics/gemmaker/assets/demo/*.fastq.gz   --outdir /workspace/gemmaker/output   --max_cpus 4

spficklin commented 2 years ago

The problem seems to be indicated by these lines in the error log:

First:

ps: error while loading shared libraries: libprocps.so.4: cannot open shared object file: No such file or directory

Followed by:

/workspace/projects/systemsgenetics/gemmaker/bin/clean_work_files.sh: line 15: perl: command not found

Then by:

stat: missing operand

I'm not sure where the ps error is coming from, because the bin/clean_work_files.sh script that is being called doesn't use ps. The perl error, however, occurs at this line:

https://github.com/SystemsGenetics/GEMmaker/blob/9a880fd991ae435dd4a755d19a642827ae8df701/bin/clean_work_files.sh#L15

The clean_work_dirs process doesn't use a container. It's just basic Bash code, and since we figured perl was ubiquitous we didn't bother to create a container for this step. So my thinking is that whatever container is being used on your k8s system is missing perl and the libraries for ps, and has a different stat command.
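
A quick way to confirm this is to check, from inside the environment where the step runs, which of the commands named in the error log are actually on the PATH. The following is only a rough sketch: the function name `missing_tools` is made up for illustration, and the tool list is taken from the error messages above rather than from the script itself, so it may be incomplete.

```shell
#!/bin/sh
# Print the name of every listed command that cannot be found on PATH.
# (missing_tools is a hypothetical helper, not part of GEMmaker.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Commands that appear in the error log; this list is an assumption.
missing="$(missing_tools perl stat truncate touch ps)"

if [ -n "$missing" ]; then
    # Report on stderr, one tool per line, without aborting.
    echo "Not found in this environment:" >&2
    printf '%s\n' "$missing" >&2
fi
```

Note that this only tells you whether a command can be found. A binary that is present but cannot load its shared libraries, as with the ps error above, would still pass the check, so that case needs to be tested by actually running the command.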

spficklin commented 2 years ago

Perhaps the best solution would be to specify a container for this step that has the correct libraries and binaries.

spficklin commented 2 months ago

Going through and closing old issues. I believe this has been corrected.