icebert / eccDNA_RCA_nanopore

eccDNA identification from nanopore long reads of rolling-circle amplicon
MIT License

The results of the data part are inconsistent #6

Open Le1223 opened 1 year ago

Le1223 commented 1 year ago

I used porechop to process the SRR13602435.fastq file, but there were no changes in the resulting fastq file.

Then, I aligned SRR13602435.fastq to mm10.combined.fa using minimap2 v2.17. After running eccDNA_RCA_nanopore, the counts from the info file (fullpass1 reads, unique eccDNA) differ from the results reported in the article.

command:
porechop -i SRR13602435.fastq -o trim.fastq --extra_end_trim 0 --discard_middle
/minimap2-2.17/minimap2 -cx map-ont mm10.combined.fa SRR13602435.fastq -t 16 --secondary=no >mapping.paf
python eccDNA_RCA_nanopore.py --fastq SRR13602435.fastq --paf mapping.paf --reference mm10.combined.fa --info info --seq seq --var var --verbose --minDP 4 --minAF 0.75 --maxOffset 20 --minMapQual 30 |tee out.log

fullpass1 reads: less info |awk '$2>=1'|wc -l
unique eccDNA: less info |awk '$2>=2'|cut -f 6|sort|uniq|wc -l

What could this issue be?

icebert commented 1 year ago

The fullpass >= 1 count is slightly different, which may be caused by some random effects in minimap2.

When counting the unique eccDNA, we removed the strand information for single fragment eccDNA.

For multi-fragment eccDNA, it's a little complicated:

A(+)B(+) and A(-)B(-) are the same

A(+)B(+) and A(+)B(-) are different

A(+)B(+)C(+) and B(+)C(+)A(+) are the same because of the circle
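
The rules above can be sketched as a small canonicalization step that gives equivalent eccDNA the same key before counting unique entries. This is only an illustration, not the code used for the article. It assumes each eccDNA is available as an ordered list of (fragment label, strand) pairs, and it reads the "A(+)B(+) equals A(-)B(-)" rule as the reverse complement of the whole circle (reverse the fragment order and flip every strand), which gives the same result for two fragments. The names Fragment and canonical_key and the "." strand placeholder are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: (fragment label, strand), e.g. ("A", "+").
Fragment = Tuple[str, str]


def _rotations(frags: List[Fragment]) -> List[Tuple[Fragment, ...]]:
    """All cyclic rotations of the fragment list (a circle has no fixed start)."""
    return [tuple(frags[i:] + frags[:i]) for i in range(len(frags))]


def _reverse_complement(frags: List[Fragment]) -> List[Fragment]:
    """Read the circle from the other strand: reverse order, flip every strand."""
    flip = {"+": "-", "-": "+"}
    return [(label, flip[strand]) for label, strand in reversed(frags)]


def canonical_key(frags: Sequence[Fragment]) -> Tuple[Fragment, ...]:
    """Return a key that is identical for eccDNA considered the same above."""
    frags = list(frags)
    if len(frags) == 1:
        # Single-fragment eccDNA: strand information is dropped.
        return ((frags[0][0], "."),)
    # Multi-fragment eccDNA: pick the smallest representation among all
    # rotations of the circle and of its reverse complement (assumption).
    candidates = _rotations(frags) + _rotations(_reverse_complement(frags))
    return min(candidates)


if __name__ == "__main__":
    circles = [
        [("A", "+"), ("B", "+")],
        [("A", "-"), ("B", "-")],
        [("A", "+"), ("B", "-")],
        [("A", "+"), ("B", "+"), ("C", "+")],
        [("B", "+"), ("C", "+"), ("A", "+")],
    ]
    unique = {canonical_key(c) for c in circles}
    print(len(unique), "unique eccDNA")
```

Counting unique eccDNA with plain sort|uniq on a column that still carries strand and fragment order would treat these equivalent forms as different entries, which is one way the unique count could come out higher than the reported one.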