Hi,

I have recently been analyzing Nanopore (ONT) cDNA PCR data. For NGS data I used Picard to remove duplicates, so I ran Picard on the ONT cDNA data as well. It marked most of the reads as duplicates, so I am not sure whether Picard is appropriate for ONT cDNA PCR data. Here are the commands I used:
{minimap2} -t {processor} -ax splice --secondary=no --cs {ref} {out_name}.min120.Q12.fastq -o {out_name}.minimap2.cs.sam
{java} -jar {picard} MarkDuplicates -I {out_name}.minimap2.cs.unique.sort.bam -O {out_name}.minimap2.cs.sort.makdup.bam -M {out_name}_duplicate_metric --VALIDATION_STRINGENCY SILENT --TMP_DIR ./
Here are the results:
$ samtools flagstats 293.minimap2.cs.sort.dedup.bam
13272796 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
9213321 + 0 duplicates
13272796 + 0 mapped (100.00% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
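For scale, the duplicate fraction can be worked out from the flagstat counts. This is only a minimal sketch: the two numbers are copied from the output above, and the helper name `duplicate_fraction` is hypothetical, not part of samtools or Picard.

```python
def duplicate_fraction(total_reads: int, duplicate_reads: int) -> float:
    """Return the share of reads flagged as duplicates (0.0 to 1.0)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return duplicate_reads / total_reads


# Counts copied from the samtools flagstats output in this post.
total = 13272796   # "in total" line
dups = 9213321     # "duplicates" line

frac = duplicate_fraction(total, dups)
remaining = total - dups  # reads that would survive if duplicates were dropped

print(f"{frac:.1%} of reads flagged as duplicates")
print(f"{remaining} reads not flagged")
```

By this count, roughly 69% of the mapped reads carry the duplicate flag.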
That seems like far too many duplicates. I don't know whether Picard is suitable for ONT data. Does anyone have experience with other tools for this?
Would appreciate help!
Thanks, Jean