nf-core / kmermaid

k-mer similarity analysis pipeline
https://nf-co.re/kmermaid
MIT License

[MRG] check for file size 0 #67

Closed: pranathivemuri closed this 4 years ago

pranathivemuri commented 4 years ago
  1. Add the processes flag to make_fastqs_per_cell and the other sourmash compute calls. This flag is what actually enables parallelization, and it was missing before.
  2. Use barcodes_file, if provided, and rename_10x_barcodes.
  3. Check whether the uncompressed contents of the gzip file are 0 bytes, instead of filtering on an arbitrary compressed size of 20 bytes on the assumption that the gzip header is always that size. The header size can vary, including between operating systems.

Learn more about contributing: https://github.com/nf-core/kmer-similarity/tree/master/.github/CONTRIBUTING.md

pranathivemuri commented 4 years ago
N E X T F L O W  ~  version 19.07.0
Launching `main.nf` [naughty_wozniak] - revision: 603d8e2408
----------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/kmermaid v1.0.0dev
----------------------------------------------------
Run Name          : naughty_wozniak
Skip trimming?    : false
K-mer sizes       : 3,9
Molecule          : dna,protein,dayhoff
Log2 Sketch Sizes : 2,4
One Sig per Record: true
Track Abundance   : false
10x .tgz          : [https://github.com/nf-core/test-datasets/raw/kmermaid/testdata/mouse_lung.tgz, https://github.com/nf-core/test-datasets/raw/olgabot/kmermaid-unaligned-tgz-v3/testdata/mouse_brown_fat_ptprc_plus_unaligned.tgz]
10x SAM tags      : CB,UB,XC,XM,RG
10x Cell pattern  : (CB|CB):Z:([ACGT]+)(\-1)?
10x UMI pattern   : (UB|XB):Z:([ACGT]+)
Min UMI/cell      : 1
Max Resources     : 6 GB memory, 2 cpus, 2d time per job
Output dir        : results_tenx
Launch dir        : /Users/pranathivemuri/czbiohub/kmermaid
Working dir       : /Users/pranathivemuri/czbiohub/kmermaid/work
Script dir        : /Users/pranathivemuri/czbiohub/kmermaid
User              : pranathivemuri
Config Profile    : standard
Config Description: Minimal test dataset to check pipeline function
----------------------------------------------------
executor >  local (6)
[2b/55a003] process > get_software_versions                                       [100%] 1 of 1, failed: 1
[8c/54015a] process > tenx_tgz_extract_bam (mouse_brown_fat_ptprc_plus_unaligned) [100%] 2 of 2 ✔
[83/391ccf] process > samtools_fastq_aligned (mouse_lung)                         [100%] 1 of 1
[2b/b0a027] process > samtools_fastq_unaligned (mouse_lung)                       [100%] 1 of 1, failed: 1
[c1/b73d2d] process > count_umis_per_cell (mouse_lung__aligned)                   [100%] 1 of 1, failed: 1
[-        ] process > extract_per_cell_fastqs                                     -
[-        ] process > fastp                                                       -
[-        ] process > sourmash_compute_sketch_fastx_nucleotide                    -
[-        ] process > sourmash_compare_sketches                                   -
[nf-core/kmermaid] Pipeline completed with errors
WARN: Task runtime metrics are not reported when using macOS without a container engine
WARN: Killing pending tasks (2)
Error executing process > 'count_umis_per_cell (mouse_lung__aligned)'

Caused by:
  Process `count_umis_per_cell (mouse_lung__aligned)` terminated with an error exit status (2)

Command executed:

  if [[ `gzip -l $(realpath mouse_lung__aligned.fastq.gz) | awk 'NR==2 {print $2}'` ne 0 ]]; then
    bam2fasta count_umis_percell \
        --filename mouse_lung__aligned.fastq.gz \
        --min-umi-per-barcode 1 \
        --cell-barcode-pattern '(CB|CB):Z:([ACGT]+)(\-1)?' \
        --molecular-barcode-pattern '(UB|XB):Z:([ACGT]+)' \
        --write-barcode-meta-csv mouse_lung__aligned__n_umi_per_cell.csv \
        --barcodes-significant-umis-file mouse_lung__aligned__barcodes.tsv
  fi

Command exit status:
  2

Command output:
  (empty)

Command error:
  .command.sh: line 2: conditional binary operator expected

Work dir:
  /Users/pranathivemuri/czbiohub/kmermaid/work/c1/b73d2ddd095a2fd316fdb10b90274d

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
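
The `conditional binary operator expected` message is what bash reports when `[[ ... ]]` finds a bare word such as `ne` where an operator belongs; the integer "not equal" operator is spelled `-ne`. A minimal sketch of the corrected check, assuming the same `gzip -l` / `awk` approach as the command above (`has_reads` is a hypothetical helper name, not something defined in the pipeline):

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the gzip file has a non-zero
# uncompressed size, i.e. there is something for bam2fasta to count.
has_reads() {
    local size
    # Row 2, column 2 of `gzip -l` output is the uncompressed size in bytes.
    size=$(gzip -l "$1" | awk 'NR==2 {print $2}')
    # -ne (with the leading dash) is the integer comparison; a bare `ne`
    # is a syntax error inside [[ ]].
    [[ "$size" -ne 0 ]]
}

# Example usage inside the process script (file name taken from the log above):
#   if has_reads mouse_lung__aligned.fastq.gz; then
#       bam2fasta count_umis_percell --filename mouse_lung__aligned.fastq.gz ...
#   fi
```

Using `$(...)` instead of backticks and quoting the expansion also keeps the test well-formed if `gzip -l` prints nothing.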