sdparekh / zUMIs

zUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs
GNU General Public License v3.0

Total alignments : 0 #318

Closed Chuang1118 closed 2 years ago

Chuang1118 commented 2 years ago

Hello, zUMIs team,

I am a new zUMIs user and have run into a problem at the mapping step.

Install zUMIs

For reproducibility, I used Singularity v3 with the image from Docker Hub: docker://tomkellygenetics/zumis:v2.9.7

Input

zcat FLASH_1GC_P7_S7L001R1_001.fastq.gz | head # my read1 is cDNA

@VH00228:2:AAAT2MKM5:1:1101:20257:1057 1:N:0:CCCGTTAG
AATAATCGGGTGGGGGCACGGAATTCGAAGACCCAGCAGCGAACACAAGTCCCGGAGGCACAGAAAGGGCAAACATTCTAAATCCTTGGTAAGGGCTCCGTGCAGTAGTTAACC
+
-CCCC-CCCCC-CCCC;CC;CCCCCCCCCC;CCCCC;C;CCC;CCCCCCCC-CCCCC;-CC;CCCC-CC-CC-C-CCC;C;CC;-CCCCCCCC-CCCCCCCCCCC;CCCCCCCC
@VH00228:2:AAAT2MKM5:1:1101:27112:1057 1:N:0:CCAGTTAG
GCTGGGCTCGGGCCTGTAACAGATCGGAAGAGCACACGTCTGAACTCCAGTCACCCAGTTAGATCTGGTATTCCGTCTTCTGCTTGAAAAGACGGGGGGGGGGGGGGGGGGGGG
+
CCCCCC---C-CC;CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCC---C;-C----;C;C--CC-;CCC;C-----CCCCCCCCCCCCCCCCCCC
@VH00228:2:AAAT2MKM5:1:1101:21640:1076 1:N:0:CCAGTTAG
GTATAGGGGTTAGTCCTTGCTATATTATGCCTTGGTTATAATTTTTCATCTTTCCCTTGCGGTACTATATCTATTGCGCCAGGTTTCAATTTCTATCGCCTATACTTTATTTGG

zcat FLASH_1GC_P7_S7L001R2_001.fastq.gz | head # my read2 is BC+UMI

@VH00228:2:AAAT2MKM5:1:1101:20257:1057 2:N:0:CCCGTTAG
ATTCAGACGTTTGGGC
+
-C--CCCCC;C-C;C-
@VH00228:2:AAAT2MKM5:1:1101:27112:1057 2:N:0:CCAGTTAG
GTTACAGGCCATAGCA
+
-CCCCCCCCCCCCC-C
@VH00228:2:AAAT2MKM5:1:1101:21640:1076 2:N:0:CCAGTTAG
TGAATACCAACAGCCT

CMD

nohup singularity exec -B /mnt/DOSI:/mnt/DOSI -B /home/path/zumiexec/:/home/zUMIs/ /mnt/DOSI/path/zumis_v2.9.7/zumis_v2.9.7.sif /home/zUMIs/zUMIs.sh -c -y zUMIs_plate7.yaml &

zUMIs part

-rwxr-xr-x 1 me utilisa. du domaine     225661 mai   18 14:58 FLASHfb5p.BCstats.txt*
-rwxr-xr-x 1 me utilisa. du domaine       4412 mai   18 15:07 FLASHfb5p.filtered.Aligned.GeneTagged.sorted.bam*
-rwxr-xr-x 1 me utilisa. du domaine       2312 mai   18 15:07 FLASHfb5p.filtered.Aligned.GeneTagged.sorted.bam.bai*
-rwxr-xr-x 1 me utilisa. du domaine       4398 mai   18 15:06 FLASHfb5p.filtered.tagged.Aligned.out.bam*
-rwxr-xr-x 1 me utilisa. du domaine    2786198 mai   18 15:06 FLASHfb5p.filtered.tagged.Aligned.toTranscriptome.out.bam*
-rwxr-xr-x 1 me utilisa. du domaine  193643646 mai   18 14:58 FLASHfb5p.filtered.tagged.bam*
-rwxr-xr-x 1 me utilisa. du domaine       5877 mai   18 15:06 FLASHfb5p.filtered.tagged.Log.final.out*
-rwxr-xr-x 1 me utilisa. du domaine  196527591 mai   18 14:58 FLASHfb5p.filtered.tagged.unmapped.bam*
-rwxr-xr-x 1 me utilisa. du domaine 1365453683 mai   18 14:58 FLASHfb5p.final_annot.gtf*
-rwxr-xr-x 1 me utilisa. du domaine       1569 mai   18 14:58 FLASHfb5p.postmap.yaml*
-rwxr-xr-x 1 me utilisa. du domaine        244 mai   18 14:57 FLASHfb5p.zUMIs_runlog.txt*
-rwxr-xr-x 1 me utilisa. du domaine        150 mai   18 14:57 FLASHfb5p.zUMIs_YAMLerror.log*
-rwxr-xr-x 1 me utilisa. du domaine      72683 mai   18 15:08 nohup.out*
drwxr-xr-x 2 me utilisa. du domaine          0 mai   18 15:06 zUMIs_output/
-rwxr-xr-x 1 me utilisa. du domaine       4780 mai   18 14:58 zUMIs_plate7.run.yaml*
-rwxr-xr-x 1 me utilisa. du domaine       4839 mai   18 14:57 zUMIs_plate7.yaml*
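One way to see what STAR itself reported for this run is to read the summary fields in FLASHfb5p.filtered.tagged.Log.final.out from the listing above. The following is a minimal sketch, assuming the usual `label |<TAB>value` layout of that file; the helper name `star_log_field` is hypothetical and not part of zUMIs or STAR.

```shell
# Print the value of one field from a STAR Log.final.out file.
# Assumed line layout: "      Number of input reads |<TAB>12345"
star_log_field() {
  # $1 = path to Log.final.out, $2 = exact field label
  awk -F '|' -v label="$2" '
    {
      key = $1; val = $2
      gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim whitespace around the label
      gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim whitespace around the value
      if (key == label) { print val; exit }
    }
  ' "$1"
}

# Usage (file name taken from the listing above):
# star_log_field FLASHfb5p.filtered.tagged.Log.final.out "Number of input reads"
# star_log_field FLASHfb5p.filtered.tagged.Log.final.out "Uniquely mapped reads %"
```

If the number of input reads is already zero, the problem probably lies before alignment (for example in how the input BAM was read) rather than in the reference or the STAR parameters.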

Both FLASHfb5p.filtered.Aligned.GeneTagged.sorted.bam and FLASHfb5p.filtered.tagged.Aligned.out.bam are empty; in other words, they contain only a header.

samtools view FLASHfb5p.filtered.tagged.unmapped.bam | head -n 5

VH00228:2:AAAT2MKM5:1:1101:36940:1076   4   *   0   0   *   *   0   0   CTCCCGGCCGCCGAGGGCGCACCACCGGCCCGTCTCGCCCGCCGCGCCGGGGAGGTGGAGCACGAGCGCACGTGTTAGGACCCGAAAGATGGGGAACTATGCCTGGGCAGGGCG  CCC-;CCCCC;CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCC;CCCCCCCC  BX:Z:TACTGGTA   BC:Z:TACTGGTA   UB:Z:AAAGCTCC   QB:Z:CCCCCCCC   QU:Z:CCCCCCCC
VH00228:2:AAAT2MKM5:1:1101:35519:1095   4   *   0   0   *   *   0   0   CTGGTATTCTTCAGCATCACAAACTCAATTACTTGTTTCAAAATGAGATACAGGTTGGTTTGGATTTACTGATAGGCTTGTCCTGATAACTAAGAATGACAGCTGTAAGGGGAG  CCC;CCCCCCCCCCCCCCCCCC-CCCCCCCCCCCCCCCCCCC;CCCC-CC;CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC  BX:Z:TCTCACAC   BC:Z:TCTCACAC   UB:Z:AATATAGG   QB:Z:CCCCCCCC   QU:Z:CCCCCCCC
VH00228:2:AAAT2MKM5:1:1101:36656:1095   4   *   0   0   *   *   0   0   AGTCTGGTCTGCGCAGTGGCCACCACCGAGTTCCCCTGCTGTCACCACCAAAGGTCCAGATCGGAAGAGCACACGTCTGAACTCCAGTCACCTAGTTAGATCTGGTATGCCGTC  CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCC-CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC-CC;;--C;-CC--C--  BX:Z:GGACCTTT   BC:Z:GGACCTTT   UB:Z:GGTGGTGA   QB:Z:CCCCCCCC   QU:Z:CCCCCCCC
VH00228:2:AAAT2MKM5:1:1101:36693:1095   4   *   0   0   *   *   0   0   CACTTACCCCGGCGTGATAAACTTTATTCGCTCTTTTAATCTTGATGTCCAGGGCGGTCCCCATCTCCAATTCTCCGCAGCAGCCCCCCTGCTGTAGTCGTGACCCGTAAAGAT  CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCCCCCCCCCCCCC  BX:Z:TTACGGGT   BC:Z:TTACGGGT   UB:Z:CACGACTA   QB:Z:CCCCCCCC   QU:Z:CCCCCCCC
VH00228:2:AAAT2MKM5:1:1101:47449:1095   4   *   0   0   *   *   0   0   AACCCACAGACTTTGGTTTCCCGGAAGCTGCCCGGCGGGTCATGCCCTGCTGATGGCATGCCTGTAACAGATCGGAAGAGCACACGTCTGAACTCCAGTCACCCAGTTAGATCT  CCC-CCCCC-CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCCCCCC-CCCC;C-  BX:Z:GTTACAGG   BC:Z:GTTACAGG   UB:Z:CATGCCAT   QB:Z:CCCCCCCC   QU:Z:CCCCCCCC

samtools view FLASHfb5p.filtered.tagged.bam | head -n 5

VH00228:2:AAAT2MKM5:1:1101:36940:1076   4   *   0   0   *   *   0   0   CTCCCGGCCGCCGAGGGCGCACCACCGGCCCGTCTCGCCCGCCGCGCCGGGGAGGTGGAGCACGAGCGCACGTGTTAGGACCCGAAAGATGGGGAACTATGCCTGGGCAGGGCG  CCC-;CCCCC;CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCC;CCCCCCCC  BC:Z:TACTGGTA   UB:Z:AAAGCTCC   QB:Z:CCCCCCCC   QU:Z:CCCCCCCC
VH00228:2:AAAT2MKM5:1:1101:35519:1095   4   *   0   0   *   *   0   0   CTGGTATTCTTCAGCATCACAAACTCAATTACTTGTTTCAAAATGAGATACAGGTTGGTTTGGATTTACTGATAGGCTTGTCCTGATAACTAAGAATGACAGCTGTAAGGGGAG  CCC;CCCCCCCCCCCCCCCCCC-CCCCCCCCCCCCCCCCCCC;CCCC-CC;CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC  BC:Z:TCTCACAC   UB:Z:AATATAGG   QB:Z:CCCCCCCC   QU:Z:CCCCCCCC
VH00228:2:AAAT2MKM5:1:1101:36656:1095   4   *   0   0   *   *   0   0   AGTCTGGTCTGCGCAGTGGCCACCACCGAGTTCCCCTGCTGTCACCACCAAAGGTCCAGATCGGAAGAGCACACGTCTGAACTCCAGTCACCTAGTTAGATCTGGTATGCCGTC  CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCC-CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC-CC;;--C;-CC--C--  BC:Z:GGACCTTT   UB:Z:GGTGGTGA   QB:Z:CCCCCCCC   QU:Z:CCCCCCCC
VH00228:2:AAAT2MKM5:1:1101:36693:1095   4   *   0   0   *   *   0   0   CACTTACCCCGGCGTGATAAACTTTATTCGCTCTTTTAATCTTGATGTCCAGGGCGGTCCCCATCTCCAATTCTCCGCAGCAGCCCCCCTGCTGTAGTCGTGACCCGTAAAGAT  CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCCCCCCCCCCCCC  BC:Z:TTACGGGT   UB:Z:CACGACTA   QB:Z:CCCCCCCC   QU:Z:CCCCCCCC
VH00228:2:AAAT2MKM5:1:1101:47449:1095   4   *   0   0   *   *   0   0   AACCCACAGACTTTGGTTTCCCGGAAGCTGCCCGGCGGGTCATGCCCTGCTGATGGCATGCCTGTAACAGATCGGAAGAGCACACGTCTGAACTCCAGTCACCCAGTTAGATCT  CCC-CCCCC-CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCCCCCC-CCCC;C-  BC:Z:GTTACAGG   UB:Z:CATGCCAT   QB:Z:CCCCCCCC   QU:Z:CCCCCCCC
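Rather than inspecting the first few records by eye, the records in each BAM can be tallied by mapping status. The sketch below reads SAM text from standard input and uses bit 0x4 of the FLAG column, which marks an unmapped read (all records shown above have FLAG 4). The function name `count_mapping_status` is hypothetical.

```shell
# Count mapped vs. unmapped records in SAM text read from stdin.
# Bit 0x4 of the FLAG column (field 2) is set for unmapped reads.
count_mapping_status() {
  awk -F '\t' '
    /^@/ { next }                                   # skip header lines
    { if (int($2 / 4) % 2 == 1) unmapped++; else mapped++ }
    END { printf "mapped=%d unmapped=%d\n", mapped, unmapped }
  '
}

# Usage (requires samtools; file names taken from the listing above):
# samtools view FLASHfb5p.filtered.tagged.Aligned.out.bam | count_mapping_status
# samtools view FLASHfb5p.filtered.tagged.bam | count_mapping_status
```

A header-only BAM yields `mapped=0 unmapped=0`, which distinguishes "STAR wrote no records at all" from "STAR wrote records but none aligned".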

The YAML file and full standard output: zUMIs_plate7.yaml, nohup.out

I also get many "Setting locale failed" warnings, which I have not been able to resolve:

perl: warning: Falling back to the standard locale ("C").
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
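Perl prints these warnings when the locale requested by the environment is not installed. A possible workaround, sketched here under the assumption that the host's locale variables are passed into the container and that the image lacks that locale, is to force the always-available "C" locale. Singularity forwards variables prefixed with `SINGULARITYENV_` into the container.

```shell
# Assumption: the host locale (LANG / LC_*) is not installed inside the image.
# Variables prefixed with SINGULARITYENV_ are set inside the container
# without the prefix, so these become LC_ALL=C and LANG=C there.
export SINGULARITYENV_LC_ALL=C
export SINGULARITYENV_LANG=C

# Then launch zUMIs exactly as before, for example:
# singularity exec -B ... zumis_v2.9.7.sif /home/zUMIs/zUMIs.sh -c -y zUMIs_plate7.yaml
```

These warnings are usually cosmetic and are unlikely to explain empty alignment output on their own.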

Screenshots issue

[Image: Screenshot from 2022-05-18 15-14-59]

Additional message

Could you provide a more recent Singularity image?

Thanks, Chuang

Chuang1118 commented 2 years ago

Hello,

Update: I tried zUMIs version 2.9.7c with the same YAML file and get the same error. Only the "Setting locale failed" warning is resolved.

You provided these parameters:
 YAML file: zUMIs_plate7.yaml
 zUMIs directory:       /mnt/DOSI/PMLAB/BIOINFO/scRNAseq/FB5Pseq_220414_h_huFlashseq/zUMIs/newTSO/plate7/zUMIs
 STAR executable        STAR
 samtools executable        samtools
 pigz executable        pigz
 Rscript executable     Rscript
 RAM limit:   null
 zUMIs version 2.9.7c 

mer. 18 mai 2022 18:03:46 CEST
WARNING: The STAR version used for mapping is 2.7.3a and the STAR index was created using the version 2.7.1a. This may lead to an error while mapping. If you encounter any errors at the mapping stage, please make sure to create the STAR index using STAR 2.7.3a.
Filtering...
mer. 18 mai 2022 18:04:05 CEST
[1] "441364 reads were assigned to barcodes that do not correspond to intact cells."
[1] "Found 84 daughter barcodes that can be binned into 20 parent barcodes."
[1] "Binned barcodes correspond to 31532 reads."
Mapping...
[1] "2022-05-18 18:04:10 CEST"
May 18 18:04:19 ..... started STAR run
May 18 18:04:19 ..... loading genome
May 18 18:04:19 ..... started STAR run
May 18 18:04:19 ..... loading genome
May 18 18:04:19 ..... started STAR run
May 18 18:04:19 ..... loading genome
May 18 18:06:21 ..... processing annotations GTF
May 18 18:06:21 ..... processing annotations GTF
May 18 18:06:21 ..... processing annotations GTF
May 18 18:06:37 ..... inserting junctions into the genome indices
May 18 18:06:37 ..... inserting junctions into the genome indices
May 18 18:06:37 ..... inserting junctions into the genome indices
May 18 18:10:01 ..... started 1st pass mapping
May 18 18:10:01 ..... finished 1st pass mapping
May 18 18:10:02 ..... inserting junctions into the genome indices
May 18 18:10:07 ..... started 1st pass mapping
May 18 18:10:07 ..... finished 1st pass mapping
May 18 18:10:08 ..... inserting junctions into the genome indices
May 18 18:10:17 ..... started 1st pass mapping
May 18 18:10:17 ..... finished 1st pass mapping
May 18 18:10:18 ..... inserting junctions into the genome indices
May 18 18:11:40 ..... started mapping
May 18 18:11:41 ..... finished mapping
May 18 18:11:42 ..... finished successfully
May 18 18:11:44 ..... started mapping
May 18 18:11:45 ..... finished mapping
May 18 18:11:46 ..... finished successfully
May 18 18:11:57 ..... started mapping
May 18 18:11:57 ..... finished mapping
May 18 18:11:59 ..... finished successfully
mer. 18 mai 2022 18:12:00 CEST
Counting...
[1] "2022-05-18 18:12:10 CEST"
[1] "4.5e+08 Reads per chunk"
[1] "Loading reference annotation from:"
[1] "/mnt/DOSI/PMLAB/BIOINFO/scRNAseq/FB5Pseq_220414_h_huFlashseq/zUMIs/newTSO/plate7/FLASHfb5p.final_annot.gtf"
[1] "Annotation loaded!"
Warning message:
`as_quosure()` requires an explicit environment as of rlang 0.3.0.
Please supply `env`.
This warning is displayed once per session. 
[1] "Assigning reads to features (ex)"

        ==========     _____ _    _ ____  _____  ______          _____  
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 1.32.4

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                           S FLASHfb5p.filtered.tagged.Aligned.out.bam      ||
||                                                                            ||
||              Annotation : R data.frame                                     ||
||      Assignment details : <input_file>.featureCounts.bam                   ||
||                      (Note that files are saved to the output directory)   ||
||                                                                            ||
||      Dir for temp files : .                                                ||
||                 Threads : 30                                               ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||      Multimapping reads : counted                                          ||
||     Multiple alignments : primary alignment only                           ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : not counted                                      ||
||        Both ends mapped : not required                                     ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file .Rsubread_UserProvidedAnnotation_pid491710 ...        ||
||    Features : 328745                                                       ||
||    Meta-features : 60761                                                   ||
||    Chromosomes/contigs : 151                                               ||
||                                                                            ||
|| Process BAM file FLASHfb5p.filtered.tagged.Aligned.out.bam...              ||
||    Single-end reads are included.                                          ||
||    Assign alignments to features...                                        ||
||    Total alignments : 0                                                    ||
||    Successfully assigned alignments : 0                                    ||
||    Running time : 0.00 minutes                                             ||
||                                                                            ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

[1] "Assigning reads to features (in)"

        ==========     _____ _    _ ____  _____  ______          _____  
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 1.32.4

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                           S FLASHfb5p.filtered.tagged.Aligned.out.bam. ... ||
||                                                                            ||
||              Annotation : R data.frame                                     ||
||      Assignment details : <input_file>.featureCounts.bam                   ||
||                      (Note that files are saved to the output directory)   ||
||                                                                            ||
||      Dir for temp files : .                                                ||
||                 Threads : 30                                               ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||      Multimapping reads : counted                                          ||
||     Multiple alignments : primary alignment only                           ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : not counted                                      ||
||        Both ends mapped : not required                                     ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file .Rsubread_UserProvidedAnnotation_pid491710 ...        ||
||    Features : 232472                                                       ||
||    Meta-features : 27187                                                   ||
||    Chromosomes/contigs : 34                                                ||
||                                                                            ||
|| Process BAM file FLASHfb5p.filtered.tagged.Aligned.out.bam.ex.featureC ... ||
||    Single-end reads are included.                                          ||
||    Assign alignments to features...                                        ||
||    Total alignments : 0                                                    ||
||    Successfully assigned alignments : 0                                    ||
||    Running time : 0.00 minutes                                             ||
||                                                                            ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

[1] "2022-05-18 18:13:33 CEST"
[1] "Coordinate sorting final bam file..."
[1] "2022-05-18 18:13:34 CEST"
[1] "Here are the detected subsampling options:"
[1] "Automatic downsampling"
[1] "Working on barcode chunk 1 out of 1"
[1] "Processing 23 barcodes in this chunk..."
Error in .checkTypos(e, names_x) : Object 'GEin' not found amongst ftype
Calls: reads2genes_new ... tryCatchList -> tryCatchOne -> <Anonymous> -> .checkTypos
Execution halted
mer. 18 mai 2022 18:13:38 CEST
Loading required package: yaml
Loading required package: Matrix
[1] "loomR found"
Error in gzfile(file, "rb") : cannot open the connection
Calls: rds_to_loom -> readRDS -> gzfile
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open compressed file '/mnt/DOSI/PMLAB/BIOINFO/scRNAseq/FB5Pseq_220414_h_huFlashseq/zUMIs/newTSO/plate7/zUMIs_output/expression/FLASHfb5p.dgecounts.rds', probable reason 'No such file or directory'
Execution halted
mer. 18 mai 2022 18:13:41 CEST
Descriptive statistics...
[1] "I am loading useful packages for plotting..."
[1] "2022-05-18 18:13:42 CEST"
Error in gzfile(file, "rb") : cannot open the connection
Calls: readRDS -> gzfile
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open compressed file '/mnt/DOSI/PMLAB/BIOINFO/scRNAseq/FB5Pseq_220414_h_huFlashseq/zUMIs/newTSO/plate7/zUMIs_output/expression/FLASHfb5p.dgecounts.rds', probable reason 'No such file or directory'
Execution halted
mer. 18 mai 2022 18:13:52 CEST

Chuang1118 commented 2 years ago

I found the problem: in my case, /mnt/DOSI is Windows storage. The zUMIs_output/.tmpMerge/*filtered.tagged.bam files are not empty, but I don't know why the STAR output contains only a header. If I change the output path to Linux storage, everything works fine.
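To catch this kind of setup before a run, the filesystem type behind the output directory can be checked. This is only a sketch: it assumes GNU `df` (for `--output=fstype`), the function names are hypothetical, and the list of "Windows-like" type names is an illustrative guess, since the thread does not say how /mnt/DOSI is mounted.

```shell
# Report the filesystem type behind a directory (GNU df only).
fs_type_of() {
  df --output=fstype "$1" | tail -n 1
}

# Succeed if the type name looks like a Windows-style or network share.
# The pattern list is an assumption for illustration, not something zUMIs checks.
is_windows_like_fs() {
  case "$1" in
    cifs|smb*|ntfs*|fuseblk|vfat|exfat|drvfs|9p) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
# fstype=$(fs_type_of /mnt/DOSI)
# if is_windows_like_fs "$fstype"; then
#   echo "warning: output directory is on $fstype; consider a native Linux path"
# fi
```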

cziegenhain commented 2 years ago

Great that you have found this - I’ve seen other users report Windows-related issues before, so it is good to keep this in mind.

Best, Christoph