biocorecrg / indrop

Single cell transcriptome analysis pipeline based on DropEst
Mozilla Public License 2.0

Can't open file with reads #10

Open slowkow opened 4 years ago

slowkow commented 4 years ago

I got the nextflow pipeline to run, but it immediately throws an error.

It seems that the droptag tool cannot read the FASTQ files.

Do you know what might be the problem?

When I cd into the working directory, I can see the symlinks to the FASTQ files. When I run gzip -cd $file | head I can see that the files are readable.
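The manual check described above can be sketched as a small helper that resolves each symlink and tests the whole gzip stream rather than only its first lines. `check_fastq` is a hypothetical name, not part of the pipeline, and `readlink -f` is assumed to be available (GNU coreutils). Running the same check through `singularity exec` would additionally show whether the link targets are visible inside the container; whether that is the cause here is not established in this thread.

```shell
# Hypothetical helper: verify that a FASTQ symlink resolves and is valid gzip.
check_fastq() {
  f="$1"
  # Resolve the symlink chain to its final target.
  target=$(readlink -f "$f")
  # A dangling link resolves to a path that does not exist.
  if [ -z "$target" ] || [ ! -e "$target" ]; then
    echo "BROKEN  $f -> $target"
    return 1
  fi
  # gzip -t decompresses the entire stream and checks its integrity.
  if ! gzip -t "$target" 2>/dev/null; then
    echo "CORRUPT $f -> $target"
    return 1
  fi
  echo "OK      $f -> $target"
}

# Usage, from inside the work directory:
#   for f in *.fastq.gz; do check_fastq "$f"; done
```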

For example, is it possible that the droptag command is unable to read .fastq.gz and instead requires .fastq? That would be unfortunate.

Thanks for any help!

$ nextflow indrop.nf
N E X T F L O W  ~  version 19.10.0
Launching `indrop.nf` [nauseous_ride] - revision: fb17c3423b
╔╗ ┬┌─┐┌─┐┌─┐┬─┐┌─┐╔═╗╦═╗╔═╗  ╦┌┐┌┌┬┐┬─┐┌─┐┌─┐╔═╗╦  ╔═╗╦ ╦
╠╩╗││ ││  │ │├┬┘├┤ ║  ╠╦╝║ ╦  ║│││ ││├┬┘│ │├─┘╠╣ ║  ║ ║║║║
╚═╝┴└─┘└─┘└─┘┴└─└─┘╚═╝╩╚═╚═╝  ╩┘└┘─┴┘┴└─└─┘┴  ╚  ╩═╝╚═╝╚╩╝

====================================================
BIOCORE@CRG indropSEQ - N F  ~  version 1.0
====================================================
pairs                         : /projects/villani_collab/Hamed_MC/fc-57e8a2f7-e291-4423-9907-b8ca4ed636db/DMb3-11/lane1*R{1,2,3,4}_001.fastq.gz
genome                        : /projects/external_data/10xgenomics.com/refdata-cellranger-GRCh38-3.0.0/fasta/genome.fa
annotation                    : /projects/external_data/10xgenomics.com/refdata-cellranger-GRCh38-3.0.0/genes/genes.gtf
config                        : /home/ks38/work/indrop/nextflow_dropest/indrop/conf/indrop_v3.xml
barcode_list                  : /home/ks38/work/indrop/nextflow_dropest/indrop/conf/indrop_v3_barcodes.txt
email                         : kslowikowski@mgh.harvard.edu
mtgenes                       : /home/ks38/work/indrop/nextflow_dropest/indrop/anno/mitoc_genes.txt
dbdir                         : /home/ks38/work/indrop/nextflow_dropest/indrop/db
version                       : 3_4
keepmulti                     : NO
library_tag                   : CTCTCTAT
output (output folder)        : output_v3
executor >  local (1)
[-        ] process > QConRawReads                 -
[70/f81b09] process > dropTag (lane1_NoIndex_L001) [100%] 1 of 1, failed: 1 ✘
[-        ] process > QCFiltReads                  -
[-        ] process > getReadLength                -
[-        ] process > buildIndex                   -
[-        ] process > mapping                      -
[-        ] process > removeMultimapping           -
[-        ] process > dropEst                      -
[-        ] process > dropReport                   -
[-        ] process > multiQC_unfiltered           -
Pulling Singularity image docker://biocorecrg/rnaseq:1.0 [cache /home/ks38/work/indrop/nextflow_dropest/indrop/singularity/biocorecrg-rnaseq-1.0.img]
Error executing process > 'dropTag (lane1_NoIndex_L001)'

Caused by:
  Process `dropTag (lane1_NoIndex_L001)` terminated with an error exit status (134)

Command executed:

  droptag -r 0 -S -s -t CTCTCTAT -p 8 -c indrop_v3.xml lane1_NoIndex_L001_R2_001.fastq.gz lane1_NoIndex_L001_R4_001.fastq.gz lane1_NoIndex_L001_R1_001.fastq.gz lane1_NoIndex_L001_R3_001.fastq.gz

Command exit status:
  134

Command output:
  (empty)

Command error:
  INFO:    Convert SIF file to sandbox...
  terminate called after throwing an instance of 'std::runtime_error'
    what():  Can't open file with reads: 'lane1_NoIndex_L001_R2_001.fastq.gz'
  .command.sh: line 2: 48996 Aborted                 (core dumped) droptag -r 0 -S -s -t CTCTCTAT -p 8 -c indrop_v3.xml lane1_NoIndex_L001_R2_001.fastq.gz lane1_NoIndex_L001_R4_001.fastq.gz lane1_NoIndex_L001_R1_001.fastq.gz lane1_NoIndex_L001_R3_001.fastq.gz
  INFO:    Cleaning up image...

Work dir:
  /home/ks38/work/indrop/nextflow_dropest/indrop/work/70/f81b09101b856a29fb9bc199c97677

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
Failed to invoke `workflow.onComplete` event handler

 -- Check script 'indrop.nf' at line: 396 or see '.nextflow.log' file for more details

lucacozzuto commented 4 years ago

Hi, no, I have analysed a lot of gzipped FASTQ files. I see a strange error: terminate called after throwing an instance of 'std::runtime_error', with exit status 134. Could it be a lack of resources (RAM, time, etc.) or something else? Can you try running singularity exec -e $SINGIMAGE .command.sh directly? Alternatively, you can rerun requesting more RAM.
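For reference, a shell exit status above 128 conventionally means the process was killed by signal number (status - 128). The sketch below decodes such a status; `describe_exit` is a hypothetical helper name. Status 134 maps to SIGABRT, which matches the "Aborted (core dumped)" line in the log and is what an uncaught C++ exception produces. A process killed for exceeding memory would more typically end with SIGKILL (137), although that depends on the system.

```shell
# Hypothetical helper: decode a shell exit status.
# Values above 128 mean "terminated by signal (status - 128)".
describe_exit() {
  status="$1"
  if [ "$status" -gt 128 ]; then
    sig=$((status - 128))
    # kill -l <number> prints the signal name without the SIG prefix.
    echo "signal $sig (SIG$(kill -l "$sig"))"
  else
    echo "exit code $status"
  fi
}

describe_exit 134   # signal 6 (SIGABRT)
describe_exit 137   # signal 9 (SIGKILL)
```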

slowkow commented 4 years ago

There is no lack of resources on this machine.

Could I ask how you find the $SINGIMAGE variable?

lucacozzuto commented 4 years ago

Sorry, SINGIMAGE stands for the Singularity image: it is the path to your Singularity image (built by the NF pipeline).
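A minimal sketch for locating the image, assuming the images are cached as .img files in a singularity/ directory under the pipeline folder, as the "Pulling Singularity image ... [cache ...]" line in the log above suggests. `list_sing_images` is a hypothetical helper name, and `-maxdepth` assumes GNU or BSD find.

```shell
# Hypothetical helper: list Singularity image files in a cache directory.
# Defaults to ./singularity when no directory is given.
list_sing_images() {
  find "${1:-singularity}" -maxdepth 1 -type f -name '*.img' | sort
}

# Usage, from the pipeline folder:
#   list_sing_images singularity
#   SINGIMAGE=$(list_sing_images singularity | grep indrops)
```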

slowkow commented 4 years ago

Thanks, I believe you mean that $SINGIMAGE refers to this file:

singularity/biocorecrg-indrops-0.4.img 

Here's what happens when I run the command you suggested:

$ singularity exec -e singularity/biocorecrg-indrops-0.4.img /home/ks38/work/indrop/nextflow_dropest/indrop/work/70/f81b09101b856a29fb9bc199c97677/.command.sh
INFO:    Convert SIF file to sandbox...
FATAL:   permission denied
INFO:    Cleaning up image...

If I cd into the working directory, then I get a different error:

$ cd /home/ks38/work/indrop/nextflow_dropest/indrop/work/70/f81b09101b856a29fb9bc199c97677/

$ singularity exec -e ../../../singularity/biocorecrg-indrops-0.4.img .command.sh
INFO:    Convert SIF file to sandbox...
FATAL:   ".command.sh": executable file not found in $PATH
INFO:    Cleaning up image...

Finally, if I change the argument from .command.sh to ./.command.sh then I get back the first error again:

$ singularity exec -e ../../../singularity/biocorecrg-indrops-0.4.img ./.command.sh
INFO:    Convert SIF file to sandbox...
FATAL:   permission denied
INFO:    Cleaning up image...

lucacozzuto commented 4 years ago

This is a Singularity problem, so I imagine something is wrong with your Singularity installation. You can try changing the Singularity version or asking the developers directly. I see a problem reported here: https://github.com/sylabs/singularity/issues/3892, but I cannot help with this.

lucacozzuto commented 4 years ago

Actually, the correct command would be:

singularity exec -e ../../../singularity/biocorecrg-indrops-0.4.img bash .command.sh
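One plausible reason the bash form behaves differently (an assumption, not confirmed in this thread) is that .command.sh lacks the execute bit: running a file directly requires execute permission, while passing it to bash only requires read permission. The sketch below reproduces that difference outside Singularity, using a throwaway script in a temporary directory.

```shell
# Create a throwaway script without the execute bit.
demo_dir=$(mktemp -d)
printf '#!/bin/bash\necho "script ran"\n' > "$demo_dir/.command.sh"
chmod 644 "$demo_dir/.command.sh"

# Direct execution needs the execute bit, so this attempt fails.
direct_out=$( cd "$demo_dir" && ./.command.sh 2>&1 ) || direct_status=$?

# Passing the file to bash only needs read permission, so this succeeds.
bash_out=$( cd "$demo_dir" && bash .command.sh )

echo "direct: exit status ${direct_status:-0}"
echo "bash:   $bash_out"
rm -rf "$demo_dir"
```

If the bash form still reports "permission denied" under Singularity, the cause lies elsewhere, for example in the Singularity installation itself.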