Open pdimens opened 8 months ago
Sorry for the confusion. The barcode lengths are 16 bp, 30 bp, and 18 bp for 10x Genomics, stLFR, and TELL-Seq sequencing technologies, respectively. The final item "Barcode_Length" is a manually annotated label and it does not affect the outcome of the MKFQ function. We have removed this unnecessary label and updated the simulationDB.zip.
Thank you for this clarification. In addition, could you please clarify what is meant by "coverage for long fragment" and "average number of molecules per droplet"?
The term "Coverage for Long Fragment" refers to the average coverage of sequencing reads across a DNA fragment. To calculate the value, we first infer the genomic coordinates of the start and end points of a fragment, and then calculate the average coverage of the reads spanning that fragment. Please see the article "Assessment of human diploid genome assembly with 10x Linked-Reads data" for further details.
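As a rough illustration of the calculation described above, here is a minimal Python sketch. It assumes the reads of one fragment have already been grouped (for example by barcode and proximity) and are given as half-open (start, end) intervals on the same chromosome. The function name `fragment_coverage` and the input layout are hypothetical and are not taken from the tool; the exact definition the tool uses may differ.

```python
def fragment_coverage(reads):
    """Estimate average read coverage across one inferred long fragment.

    reads: list of (start, end) half-open intervals for reads assigned
           to the same fragment (hypothetical input layout).
    Returns (fragment_start, fragment_end, coverage).
    """
    if not reads:
        raise ValueError("at least one read is required")

    # Infer the fragment boundaries from the outermost reads.
    frag_start = min(start for start, _ in reads)
    frag_end = max(end for _, end in reads)
    frag_len = frag_end - frag_start

    # Average coverage = total sequenced bases / inferred fragment length.
    sequenced_bases = sum(end - start for start, end in reads)
    return frag_start, frag_end, sequenced_bases / frag_len


# Three 150 bp reads spread over an inferred 50 kb fragment.
example_reads = [(1000, 1150), (20000, 20150), (50850, 51000)]
frag_start, frag_end, coverage = fragment_coverage(example_reads)
print(frag_start, frag_end, coverage)
```

In this example the coverage is far below 1x, which is typical for linked reads: each long fragment is only sparsely sampled by short reads.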
The term "average number of molecules per droplet" is often used for 10x Genomics linked-read sequencing. For this technology, 1 ng of high-molecular-weight (HMW) genomic DNA is distributed across more than 100,000 droplets, resulting in millions of DNA molecules during sequencing. Consequently, one droplet can contain from a few to several hundred DNA molecules. The average number of molecules per droplet may be used to assess the quality of linked-read sequencing libraries. For detailed examples and comprehensive statistics, you may check the article "Haplotyping germline and cancer genomes with high-throughput linked-read sequencing".
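To make the statistic concrete, here is a small Python sketch of how it could be computed from data. It assumes each inferred molecule has already been assigned to a droplet barcode; the function name `mean_molecules_per_droplet` and the (barcode, molecule_id) input layout are hypothetical and not part of the tool.

```python
from collections import defaultdict


def mean_molecules_per_droplet(assignments):
    """Average number of distinct molecules per droplet barcode.

    assignments: iterable of (barcode, molecule_id) pairs
                 (hypothetical input layout).
    """
    molecules_by_barcode = defaultdict(set)
    for barcode, molecule_id in assignments:
        molecules_by_barcode[barcode].add(molecule_id)

    if not molecules_by_barcode:
        raise ValueError("no barcodes observed")

    total_molecules = sum(len(m) for m in molecules_by_barcode.values())
    return total_molecules / len(molecules_by_barcode)


# Two droplets: one holding three molecules, one holding a single molecule.
example = [("AAAC", "m1"), ("AAAC", "m2"), ("AAAC", "m3"), ("GGTT", "m4")]
mean_molecules = mean_molecules_per_droplet(example)
print(mean_molecules)
```

Only droplets with at least one observed molecule are counted here, so empty droplets do not lower the average.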
I understand, thank you
Good afternoon,
In the config file for MKFQ, there is a final parameter Barcode_Length. In the example files provided by google drive, the barcode length is 54 for the 10x example. I have some confusion around this because 10x barcodes are 16bp. Can you please explain what Barcode_Length accomplishes?