Closed ChongLu121 closed 6 years ago
Hey Chong,
Please update Rsubread and everything should be back working fine. I have successfully run zUMIs with more than a billion read pairs, so just go ahead with your large fastq.
Feel free to reopen the issue if you have any other questions.
Best, Christoph
Hi Christoph,
Same error shows up even though I have updated the Rsubread package. Anything else I could try? Thanks a lot.
Check the verbose of featureCounts again carefully within your zUMIs call. Which version number is reported? Best, Christoph
Your jobs will run on this machine.
Make sure you have more than 1,9G RAM and 4 processors available.
Your jobs will be started from filtering.
You provided these parameters: SLURM workload manager: no Summary Stats to produce: yes Start the pipeline from: filtering A custom mapped BAM: NA Custom filtered FASTQ: no Barcode read: /home/angela/zUMIs/ExampleData/barcoderead_HEK.1mio.fq.gz cDNA read: /home/angela/zUMIs/ExampleData/cDNAread_HEK.1mio.fq.gz Study/sample name: test Output directory: /home/angela/zUMIs/ExampleData Cell/sample barcode range: 1-6 UMI barcode range: 7-16 Retain cell with >=N reads: 100 Genome directory: /home/angela/SC_PRE/reference/hg38_chr22 GTF annotation file: /home/angela/SC_PRE/reference/GRCh38.84.chr22.gtf Number of processors: 4 Read length: 50 Strandedness: 0 Cell barcode Phred: 20 UMI barcode Phred: 20
Hamming Distance (UMI): 0
Hamming Distance (CellBC): 0
Plate Barcode Read: NA
Plate Barcode range: NA
Barcodes: NA
zUMIs directory: /home/angela/zUMIs
STAR executable STAR
samtools executable samtools
pigz executable pigz
Rscript executable Rscript
Additional STAR parameters:
STRT-seq data: no
InDrops data: no
Library read for InDrops: NA
Barcode read2(STRT-seq): NA
Barcode read2 range(STRT-seq): 0-0
Bases(G) to trim(STRT-seq): 3
Subsampling reads: 0
zUMIs version 0.0.4
Raw reads: 0 Filtered reads: 0
Mar 13 14:16:41 ..... Started STAR run Mar 13 14:16:41 ..... Loading genome gzip: /home/angela/zUMIs/ExampleData/test.cdnaread.filtered.fastq.gz: No such file or directory Mar 13 14:16:41 ..... Processing annotations GTF Mar 13 14:16:41 ..... Inserting junctions into the genome indices Mar 13 14:16:50 ..... Started 1st pass mapping Mar 13 14:16:50 ..... Finished 1st pass mapping Mar 13 14:16:50 ..... Inserting junctions into the genome indices Mar 13 14:16:54 ..... Started mapping gzip: /home/angela/zUMIs/ExampleData/test.cdnaread.filtered.fastq.gz: No such file or directory Mar 13 14:16:54 ..... Finished successfully Loading required package: optparse [1] "I am loading useful packages..." [1] "2018-03-13 14:16:54 CET" [1] "I am making annotations in SAF... This will take less than 3 minutes..." [1] "2018-03-13 14:17:00 CET" Import genomic features from the file as a GRanges object ... OK Prepare the 'metadata' data frame ... OK Make the TxDb object ... OK 'select()' returned 1:many mapping between keys and columns [1] "I am making count tables...This will take a while!!" [1] "2018-03-13 14:17:04 CET"
========== _____ _ _ ____ _____ ______ _____
===== / ____| | | | _ \| __ \| ____| /\ | __ \
===== | (___ | | | | |_) | |__) | |__ / \ | | | |
==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
Rsubread 1.28.1
//========================== featureCounts setting ===========================\ | Input files : 1 BAM file | S /home/angela/zUMIs/ExampleData/test.aligne ... | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
Dir for temp files : . | ||||||||||
Threads : 1 | ||||||||||
Level : meta-feature level | ||||||||||
Paired-end : no | ||||||||||
Strand specific : no | ||||||||||
Multimapping reads : primary only | ||||||||||
Multi-overlapping reads : not counted | ||||||||||
Min overlapping bases : 1 | ||||||||||
\===================== http://subread.sourceforge.net/ ======================//
//================================= Running ==================================\ | Load annotation file ./.Rsubread_UserProvidedAnnotation_pid32077 ... | Features : 5737 | Meta-features : 698 | Chromosomes/contigs : 3 | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Process BAM file /home/angela/zUMIs/ExampleData/test.aligned.sorted.ba ... | ||||||||||||||||||
Single-end reads are included. | ||||||||||||||||||
Assign reads to features... | ||||||||||||||||||
Total reads : 0 | ||||||||||||||||||
Successfully assigned reads : 0 | ||||||||||||||||||
Running time : 0.00 minutes | ||||||||||||||||||
Read assignment finished. | ||||||||||||||||||
\===================== http://subread.sourceforge.net/ ======================//
========== _____ _ _ ____ _____ ______ _____
===== / ____| | | | _ \| __ \| ____| /\ | __ \
===== | (___ | | | | |_) | |__) | |__ / \ | | | |
==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
Rsubread 1.28.1
//========================== featureCounts setting ===========================\ | Input files : 1 BAM file | S /home/angela/zUMIs/ExampleData/test.aligne ... | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
Dir for temp files : . | ||||||||||
Threads : 1 | ||||||||||
Level : meta-feature level | ||||||||||
Paired-end : no | ||||||||||
Strand specific : no | ||||||||||
Multimapping reads : primary only | ||||||||||
Multi-overlapping reads : not counted | ||||||||||
Min overlapping bases : 1 | ||||||||||
\===================== http://subread.sourceforge.net/ ======================//
//================================= Running ==================================\ | Load annotation file ./.Rsubread_UserProvidedAnnotation_pid32077 ... | Features : 26131 | Meta-features : 1190 | Chromosomes/contigs : 4 | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Process BAM file /home/angela/zUMIs/ExampleData/test.aligned.sorted.ba ... | ||||||||||||||||||
Single-end reads are included. | ||||||||||||||||||
Assign reads to features... | ||||||||||||||||||
Total reads : 0 | ||||||||||||||||||
Successfully assigned reads : 0 | ||||||||||||||||||
Running time : 0.00 minutes | ||||||||||||||||||
Read assignment finished. | ||||||||||||||||||
\===================== http://subread.sourceforge.net/ ======================//
Error in data.table::fread(paste("cut -f4 ", abamfile[2], ".featureCounts", :
File is empty: /dev/shm/file7d4d29bc49fb
Calls: makeGEprofile ->
zUMI version: 0.0.4 Rsubread version: 1.28.1
I was testing with example data. Is it possible that it might be caused by "$/zUMIs/ExampleData/test.cdnaread.filtered.fastq.gz not found". I was wondering is this a warning or an error??
Thanks. :)
Hey,
In this log, the Rsubread version does not seem to be the issue.
To me it looks like something is wrong with the input files, as already the filtering step says
Raw reads: 0 Filtered reads: 0
Please redownload the fastq files from github and double check that the path is correct.
Best, Christoph
Thanks for quick response. :+1: Just re-downloaded fastq files and double checked the input path. But same issue: Raw reads: 0 Filter reads: 0
bash ~/zUMIs/zUMIs-master.sh -f ~/zUMIs/ExampleData/barcoderead_HEK.1mio.fq.gz -r ~/zUMIs/ExampleData/cDNAread_HEK.1mio.fq.gz -c 1-6 -m 7-16 -l 50 -g ~/SC_PRE/reference/hg38_chr22/ -a ~/SC_PRE/reference/GRCh38.84.chr22.gtf -n chr22test -p 8 -o ~/zUMIs/ExampleData
Hi, Can you send md5sum and ls -lh of the downloaded fastq files?
md5sum ~/zUMIs/ExampleData/barcoderead_HEK.1mio.fq.gz md5sum ~/zUMIs/ExampleData/cDNAread_HEK.1mio.fq.gz
ls -lh ~/zUMIs/ExampleData/
Thank you Swati
total 295228 drwxrwxr-x 5 angela angela 4096 Mär 14 09:16 ./ drwxrwxr-x 4 angela angela 4096 Mär 14 09:06 ../ -rwxrwxr-x 1 angela angela 20900758 Mär 14 09:02 barcoderead_HEK.1mio.fq.gz -rwxrwxr-x 1 angela angela 48684242 Mär 14 09:02 cDNAread_HEK.1mio.fq.gz -rw-rw-r-- 1 angela angela 463 Mär 14 09:15 chr22test.aligned.sorted.bam -rw-rw-r-- 1 angela angela 0 Mär 14 09:15 chr22test.aligned.sorted.bam.ex.featureCounts -rw-rw-r-- 1 angela angela 0 Mär 14 09:15 chr22test.aligned.sorted.bam.in.featureCounts -rw-rw-r-- 1 angela angela 24 Mär 14 09:15 chr22test.barcodelist.filtered.sort.sam -rw-rw-r-- 1 angela angela 0 Mär 14 09:15 chr22test.barcoderead.filtered.fastq -rw-rw-r-- 1 angela angela 0 Mär 14 09:15 chr22test.cdnaread.filtered.fastq -rw-rw-r-- 1 angela angela 1645 Mär 14 09:15 chr22test.Log.final.out -rw-rw-r-- 1 angela angela 52903 Mär 14 09:15 chr22test.Log.out -rw-rw-r-- 1 angela angela 329 Mär 14 09:15 chr22test.Log.progress.out -rw-rw-r-- 1 angela angela 0 Mär 14 09:15 chr22test.SJ.out.tab drwx------ 2 angela angela 4096 Mär 14 09:15 chr22test._STARgenome/ drwx------ 2 angela angela 4096 Mär 14 09:15 chr22test._STARpass1/ -rwxrwxr-x 1 angela angela 44426144 Mär 14 09:02 test.aligned.sorted.bam -rwxrwxr-x 1 angela angela 53060473 Mär 14 09:02 test.aligned.sorted.bam.ex.featureCounts -rwxrwxr-x 1 angela angela 53306382 Mär 14 09:02 test.aligned.sorted.bam.in.featureCounts -rwxrwxr-x 1 angela angela 81483817 Mär 14 09:02 test.barcodelist.filtered.sort.sam -rwxrwxr-x 1 angela angela 1836 Mär 14 09:02 test.Log.final.out -rwxrwxr-x 1 angela angela 55510 Mär 14 09:02 test.Log.out -rwxrwxr-x 1 angela angela 683 Mär 14 09:02 test.Log.progress.out -rwxrwxr-x 1 angela angela 279621 Mär 14 09:02 test.SJ.out.tab drwxrwxr-x 5 angela angela 4096 Mär 14 09:02 zUMIs_output/
angela@angela-MS-7817:~/zUMIs/ExampleData$ md5sum barcoderead_HEK.1mio.fq.gz 03d350700dbe1f516200687dcf8125da barcoderead_HEK.1mio.fq.gz angela@angela-MS-7817:~/zUMIs/ExampleData$ md5sum cDNAread_HEK.1mio.fq.gz c0d5f3eba4e3cf0c14aed506d0a49f30 cDNAread_HEK.1mio.fq.gz
Thank you very much Swati! :)
Okay, that seems fine to me. One more thing, did you install pigz?
If so, can you tell me the output of
pigz -dc barcoderead_HEK.1mio.fq.gz | wc -l
I have installed the pigz but didn't change its path. NICE! :+1: it's working now! Thanks a lot!
I am glad that it works!! :) Thank you for using zUMIs and let us know if you need any further help.
Dear Christoph and Swati,
I used the new version zUMIs today with the example data but got a strange error which didn't occur in the last zUMIs version. On our server, the previous version of zUMIs ran without any errors.
Here is the standard out:
You provided these parameters: SLURM workload manager: no Summary Stats to produce: yes Start the pipeline from: filtering A custom mapped BAM: NA Custom filtered FASTQ: no Barcode read: /scratch/lu/zUMIs/ExampleData/barcoderead_HEK.1mio.fq.gz cDNA read: /scratch/lu/zUMIs/ExampleData/cDNAread_HEK.1mio.fq.gz Study/sample name: chr22test Output directory: /scratch/lu/genome-index/out Cell/sample barcode range: 1-6 UMI barcode range: 7-16 Retain cell with >=N reads: 100 Genome directory: /scratch/lu/genome-index/genome/reference/hg38_chr22 GTF annotation file: /scratch/lu/genome-index/genome/reference/GRCh38.84.chr22.gtf Number of processors: 8 Read length: 50 Strandedness: 0 Cell barcode Phred: 20 UMI barcode Phred: 20
bases below phred in CellBC: 1
bases below phred in UMI: 1
Hamming Distance (UMI): 0 Hamming Distance (CellBC): 0 Plate Barcode Read: NA Plate Barcode range: NA Barcodes: NA zUMIs directory: /scratch/lu/zUMIs STAR executable STAR samtools executable samtools pigz executable pigz Rscript executable Rscript Additional STAR parameters: STRT-seq data: no InDrops data: no Library read for InDrops: NA Barcode read2(STRT-seq): NA Barcode read2 range(STRT-seq): 0-0 Bases(G) to trim(STRT-seq): 3 Subsampling reads: 0
zUMIs version 0.0.4
Raw reads: 1000000 Filtered reads: 853296
Feb 26 18:31:21 ..... started STAR run Feb 26 18:31:21 ..... loading genome Feb 26 18:31:24 ..... processing annotations GTF Feb 26 18:31:24 ..... inserting junctions into the genome indices Feb 26 18:31:48 ..... started 1st pass mapping Feb 26 18:34:41 ..... finished 1st pass mapping Feb 26 18:34:41 ..... inserting junctions into the genome indices Feb 26 18:35:05 ..... started mapping Feb 26 18:38:01 ..... finished successfully [bam_sort_core] merging from 0 files and 8 in-memory blocks... [bam_sort_core] merging from 0 files and 8 in-memory blocks... Loading required package: optparse [1] "I am loading useful packages..." [1] "2018-02-26 18:38:10 CET" [1] "I am making annotations in SAF... This will take less than 3 minutes..." [1] "2018-02-26 18:38:27 CET" Import genomic features from the file as a GRanges object ... OK Prepare the 'metadata' data frame ... OK Make the TxDb object ... OK Warning message: In .get_cds_IDX(type, phase) : The "phase" metadata column contains non-NA values for features of type stop_codon. This information was ignored. 'select()' returned 1:many mapping between keys and columns [1] "I am making count tables...This will take a while!!" [1] "2018-02-26 18:38:39 CET"
\===================== http://subread.sourceforge.net/ ======================//
\===================== http://subread.sourceforge.net/ ======================//
\===================== http://subread.sourceforge.net/ ======================//
\===================== http://subread.sourceforge.net/ ======================//
cut: /scratch/lu/genome-index/out/chr22test.aligned.sorted.bam.ex.featureCounts: No such file or directory Error in data.table::fread(paste("cut -f4 ", abamfile[2], ".featureCounts", : File is empty: /dev/shm/file69f3590aefa4 Calls: makeGEprofile ->
In addition: Warning messages:
1: In paste("readSummary", ann, files_C, fout, as.numeric(isPairedEnd), :
NAs introduced by coercion
2: In paste("readSummary", ann, files_C, fout, as.numeric(isPairedEnd), :
NAs introduced by coercion
Execution halted
[1] "I am loading useful packages for plotting..."
[1] "2018-02-26 18:38:46 CET"
Error in gzfile(file, "rb") : cannot open the connection
Calls: readRDS -> gzfile
In addition: Warning message:
In gzfile(file, "rb") :
cannot open compressed file '/scratch/lu/genome-index/out/zUMIs_output/expression/chr22test.dgecounts.rds', probable reason 'No such file or directory'
Execution halted
And I still have another question, if the fastq data is very large (about 26 Gb), would you suggest that I split those input files into small files?
All the best, Chong