Open lopezCascales opened 1 month ago
Sorry, I didn't copy the unmapped reads:
UNMAPPED READS:
Number of reads unmapped: too many mismatches | 0
% of reads unmapped: too many mismatches | 0.00%
Number of reads unmapped: too short | 50691944
% of reads unmapped: too short | 99.64%
Number of reads unmapped: other | 13601
% of reads unmapped: other | 0.03%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
Hi Alex, I have some issues with data that I received from collaborators. I have normally worked with Smart-seq plate-based protocols, but these samples are 10x, and I don't have much information about them. I have followed the workflows from some other issues, but I don't understand why I can't solve the problem. I don't know if I'm doing this correctly, so I'm going to copy, more or less, the steps I'm following. From the FASTQ files I can see a sequencer-specific instrument ID of the form @Kxxxx (HiSeq 3000(?)/4000) and flowcell IDs ending in BBXX(?), which suggests a HiSeq 3000/4000 run.
Example for one sample:
gunzip *.fastq.gz
cat file1.fastq file2.fastq > bigfile.fastq
cat file.fastq | head -n40
@K00360:651:HHKHYBBXY:1:1101:3640:1086 1:N:0:NCTCGTTT
NCTCGTTT
+
#####################################################
The only information I have about the sample:
hashtag_oligo | well10X | RunID
TGTCTTTCCTGCCAG | 3 | 20200811-SCS47-2
Given that, I assumed that I need to use 3M-february-2018.txt as the barcode whitelist. I built a human genome index of 100 (I don't know if that is enough).
#####################################################
Based on different answers on forums, I used this command for STAR:
STAR --genomeDir ./indexHuman100 \
  --readFilesIn 20200811-SCS47-2-HT_S4_R2.fastq 20200811-SCS47-2-HT_S4_R1.fastq \
  --outFileNamePrefix scRNA20200811-SCS47-2-HT \
  --outFilterType BySJout \
  --outFilterMultimapNmax 20 \
  --alignIntronMax 100000 \
  --outFilterMismatchNmax 4 \
  --outFilterMatchNminOverLread 0.3 \
  --outFilterScoreMinOverLread 0.3 \
  --outFilterScoreMin 30 \
  --alignEndsType Local \
  --soloType CB_UMI_Simple \
  --soloCBstart 1 --soloCBlen 16 \
  --soloUMIstart 17 --soloUMIlen 12 \
  --soloCBmatchWLtype 1MM_multi_Nbase_pseudocounts \
  --soloUMIfiltering MultiGeneUMI_CR \
  --soloUMIdedup 1MM_CR \
  --runThreadN 128 \
  --clipAdapterType CellRanger4 \
  --outSAMtype BAM SortedByCoordinate \
  --outSAMattributes CR UR CY UY CB UB NH HI GX GN \
  --soloFeatures Gene \
  --soloCBwhitelist 3M-february-2018.txt
EXITING because of FATAL ERROR in input read file: the total length of barcode sequence is 151 not equal to expected 28
Read ID=@K00360:651:HHKHYBBXY:1:1101:3640:1086 ; Sequence=NGGTACATCGGTAATTCCCTTTCGAGGTTTGCTAGGACCGGCNGTANAGNCCGANGGCTNNACATCTGGCAACCGNANTTCATNANANCNGAAGAGNANACGNCTGAACTCCAGTCACTCTCGTTTATCTCGTATGCCGTCTTCTGCTTGA
SOLUTION: check the formatting of input read files.
If UMI+CB length is not equal to the barcode read length, specify barcode read length with --soloBarcodeReadLength
To avoid checking of barcode read length, specify --soloBarcodeReadLength 0
########################################################
I then tried adding each of these two options. The first one did not work; the second one did:
--soloBarcodeReadLength 150
--soloBarcodeReadLength 151
Aug 22 18:19:57 ..... started STAR run
Aug 22 18:19:58 ..... loading genome
Aug 22 18:20:42 ..... started mapping
Aug 22 18:26:19 ..... finished mapping
Aug 22 18:26:20 ..... started Solo counting
Aug 22 18:26:36 ..... finished Solo counting
Aug 22 18:26:36 ..... started sorting BAM
Aug 22 18:26:39 ..... finished successfully
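The error above reports a barcode read of 151 bases where 28 were expected (16 from --soloCBlen plus 12 from --soloUMIlen). One way to see which read lengths each FASTQ file actually contains is to tabulate them. This is only a minimal sketch and not part of STAR: `read_length_hist` is a hypothetical helper name, and it assumes plain four-line FASTQ records.

```shell
# Print "length count" pairs for the reads of a FASTQ stream given on stdin.
# Assumption: standard 4-line FASTQ records, so the sequence is line 2 of each record.
read_length_hist() {
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (len in n) print len, n[len] }' | sort -n
}

# Possible usage on the first 100000 reads of each file from the STAR command above:
#   head -n 400000 20200811-SCS47-2-HT_S4_R1.fastq | read_length_hist
#   head -n 400000 20200811-SCS47-2-HT_S4_R2.fastq | read_length_hist
```

If both files turn out to contain only 151-base reads, that would be consistent with the error message; it would not by itself say which file carries the cell barcode and UMI.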
########################################################
But this is the log output file:
Number of reads unmapped: too many mismatches | 0
% of reads unmapped: too many mismatches | 0.00%
Number of reads unmapped: too short | 50691944
% of reads unmapped: too short | 99.64%
Number of reads unmapped: other | 13601
% of reads unmapped: other | 0.03%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
For the solo.out:
noNoAdapter 0
noNoUMI 0
noNoCB 0
noNinCB 0
noNinUMI 4566
noUMIhomopolymer 1216
noNoWLmatch 316720
noTooManyMM 0
noTooManyWLmatches 0
yesWLmatchExact 49288951
yesOneWLmatchWithMM 476773
yesMultWLmatchWithMM 785401
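The barcode counters above can be reduced to a single figure: the share of reads whose cell barcode was matched to the whitelist. The sketch below is an illustration only. `barcode_match_summary` is a hypothetical helper name, and it assumes that the counters are mutually exclusive categories, that the input consists of whitespace-separated name/count pairs, and that the counters starting with "yes" are the accepted reads.

```shell
# Sum whitespace-separated "name count" pairs from stdin and report how many
# reads fall into the counters whose name starts with "yes".
# Assumption: every input line holds complete name/count pairs.
barcode_match_summary() {
    awk '{
             for (i = 1; i + 1 <= NF; i += 2) {
                 total += $(i + 1)
                 if ($i ~ /^yes/) ok += $(i + 1)
             }
         }
         END {
             if (total > 0)
                 printf "%d of %d (%.2f%%)\n", ok, total, 100 * ok / total
         }'
}

# Possible usage (the path is an assumption based on the output prefix used above):
#   barcode_match_summary < scRNA20200811-SCS47-2-HTSolo.out/Barcodes.stats
```

With the numbers posted above this gives 50551125 of 50873627 (99.37%), so under these assumptions nearly all barcodes match the whitelist, while the mapping log shows that 99.64% of reads are unmapped as "too short".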
and the matrix.txt:
%%MatrixMarket matrix coordinate integer general
%
62710 6794880 81035
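In the Matrix Market coordinate format, the first line that is not a comment gives the number of rows, the number of columns, and the number of stored non-zero entries. The following sketch extracts that line; `mtx_dims` is a hypothetical helper name, and the reading of rows as features and columns as barcodes is an assumption about how the count matrix is laid out.

```shell
# Print the dimensions of a Matrix Market coordinate file given on stdin.
# Lines starting with "%" are comments; the first other line holds
# "rows columns nonzero_entries".
mtx_dims() {
    awk '!/^%/ && NF == 3 {
             printf "rows=%s cols=%s nonzero=%s\n", $1, $2, $3
             exit
         }'
}

# Possible usage (the file name is the one mentioned above):
#   mtx_dims < matrix.txt
```

For the header posted above this reports 62710 rows, 6794880 columns, and 81035 non-zero entries, which is a very sparse matrix for a run of about 50 million reads.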
#########################################################
There are almost no mapped reads. Do you have any suggestions? Thank you in advance for your help. Have a nice day. Mayte