When I used awk to sum the second column of .BCstats.txt, the result was 8347237. However, Read1.fq contains 11844959 reads, which does not match the BC sum.
So I would like to know how zUMIs detects barcodes. By the way, my data is similar to Smart-seq3: read 1 starts with the pattern "ATTGCGCAATG".
This is a snippet of my YAML file:
sequence_files:
  file1:
    name: /home/data/231110_1/20231110SCS-1_L2_1.fq.gz #path to first file
    base_definition:
      - BC(12-17,33-40,56-63)
      - UMI(64-69)
    find_pattern: ATTGCGCAATG
  file2:
    name: /home/data/231110_1/20231110SCS-1_L2_2.fq.gz #path to second file
    base_definition:
      - cDNA(1-150)
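
The comparison described above can be reproduced with a short script. This is only a minimal sketch: it assumes .BCstats.txt is whitespace-delimited with the barcode in the first column and its read count in the second (any header or non-numeric line is skipped), and that the FASTQ uses the standard four lines per record. The function names `sum_bc_counts` and `count_fastq_reads` are hypothetical helpers and are not part of zUMIs.

```python
import gzip


def sum_bc_counts(path):
    """Sum the second column of a BCstats-style file (assumed: barcode, count)."""
    total = 0
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            # Skip blank lines and anything whose second field is not a count.
            if len(fields) < 2 or not fields[1].isdigit():
                continue
            total += int(fields[1])
    return total


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming exactly four lines per read."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        return sum(1 for _ in fh) // 4


# The two numbers reported in this post:
bc_sum, total_reads = 8347237, 11844959
print(f"reads in BCstats: {bc_sum / total_reads:.1%}")
print(f"reads not counted: {total_reads - bc_sum}")
```

With the numbers from this post, about 70.5% of the reads in read 1 are accounted for in .BCstats.txt, and 3497722 reads are not.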