iprada / Circle-Map

A method for circular DNA detection based on probabilistic mapping of ultrashort reads
MIT License

NGmerge or fastqc should be used in quality control before alignment #40

Open pangxueyu233 opened 3 years ago

pangxueyu233 commented 3 years ago

Hi, this is a really interesting tool for identifying circular DNA from DNA-seq data. Recently, though, we have become concerned about the variability in circular DNA length and in read insert length, which would influence the choice of quality control tools. The library sizes of circular DNA resemble ATAC-seq results: the insert length of the DNA fragments can be less than 50 bp or more than 150 bp. For ATAC-seq quality control, we were advised to use NGmerge, which keeps the short reads, to cut adapters and filter reads, instead of fastqc (or other trimming tools in the standard pipeline), which remove short reads directly. So, we would also like to know whether we need to use trimming tools to complete quality control before alignment. Thanks!

pangxueyu233 commented 3 years ago

Another question: should we use Picard tools to remove duplicate reads caused by PCR after bwa alignment?

iprada commented 3 years ago

Dear Pangxueyu233,

I am out of the office these days. I will get back to you at the end of the week.

Sorry for the inconvenience.

Best,

Iñigo

iprada commented 3 years ago

Dear @pangxueyu233,

I am back.

Regarding:

So, we would also like to know whether we need to use trimming tools to complete quality control before alignment.

You can use any standard trimming software to remove adapters from the reads. Regarding more aggressive trimming, such as removing low-quality sequences or sequences that are too short, I think that this is not strictly necessary. Circle-Map incorporates sequencing errors and sequence length (implicitly) into the probabilistic model. Long story short, I think you are fine with just removing the adapters.

You should not use any software that merges paired-end reads. This is not supported.
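To illustrate what "adapter removal only" means here, the following is a minimal Python sketch, not the behaviour of any particular trimming tool. It cuts a 3' adapter by exact matching, keeps reads of any length, and trims each mate of a pair independently without merging them. The function names and the `min_overlap` parameter are hypothetical, and real trimmers also tolerate mismatches.

```python
def trim_adapter(seq, qual, adapter, min_overlap=3):
    """Cut a 3' adapter (exact match only) and return (seq, qual).

    No length or quality filtering is applied: short reads are kept.
    """
    # Case 1: the full adapter occurs somewhere inside the read.
    idx = seq.find(adapter)
    if idx == -1:
        # Case 2: the read ends with a prefix of the adapter.
        longest = min(len(adapter) - 1, len(seq))
        for k in range(longest, min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                idx = len(seq) - k
                break
    if idx == -1:
        return seq, qual  # no adapter found, read is left untouched
    return seq[:idx], qual[:idx]


def trim_pair(read1, read2, adapter1, adapter2):
    """Trim both mates independently; the pair is never merged."""
    return (trim_adapter(read1[0], read1[1], adapter1),
            trim_adapter(read2[0], read2[1], adapter2))
```

Because nothing is discarded, both mates of every pair still reach the aligner, which is what the advice above relies on.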

Regarding:

should we use Picard tools to remove duplicate reads caused by PCR after bwa alignment?

This depends on the biological question you want to answer. If you are interested in a qualitative analysis where you only detect the circle, keeping PCR duplicates is fine. If you want to do any quantitative analysis, you should remove duplicates.
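As a rough illustration of why duplicates matter for counting, here is a small Python sketch that keeps one fragment per identical (chromosome, start, end, strand) key. This is a simplification for illustration only and is not Picard's algorithm; the dictionary layout and the function name are assumptions.

```python
def drop_duplicates(fragments):
    """Keep the first fragment seen for each (chrom, start, end, strand)."""
    seen = set()
    kept = []
    for frag in fragments:
        key = (frag["chrom"], frag["start"], frag["end"], frag["strand"])
        if key not in seen:
            seen.add(key)
            kept.append(frag)
    return kept


# Made-up fragments: the first two are identical copies of each other.
fragments = [
    {"name": "r1", "chrom": "chr1", "start": 100, "end": 180, "strand": "+"},
    {"name": "r2", "chrom": "chr1", "start": 100, "end": 180, "strand": "+"},
    {"name": "r3", "chrom": "chr1", "start": 120, "end": 210, "strand": "-"},
]
unique = drop_duplicates(fragments)
```

In this toy input the circle is supported either way, so detection is unaffected, but the fragment count drops from three to two once the copy is removed, which is what changes a quantitative result.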

Best,

Inigo

pangxueyu233 commented 3 years ago

Thanks, I will follow your suggestion. One more question: can this tool identify ecDNA (extrachromosomal DNA), not only eccDNA? And are there any tools to visualise Circle-Map results? As you know, once we have constructed a circular sequence, we need to show our results like a plasmid map to visualise structural information, including the gene bodies that overlap the circular sequence and the positions of genes in the circle structure.

iprada commented 3 years ago

Hi,

Regarding:

One more question: can this tool identify ecDNA (extrachromosomal DNA), not only eccDNA?

Circle-Map is designed to detect any type of circular DNA. If you want to reconstruct the amplicon structure, take a look at https://github.com/virajbdeshpande/AmpliconArchitect

Regarding:

And are there any tools to visualise Circle-Map results? As you know, once we have constructed a circular sequence, we need to show our results like a plasmid map to visualise structural information, including the gene bodies that overlap the circular sequence and the positions of genes in the circle structure.

@RAHenriksen from our lab has generated beautiful Circos plots of circles. Take a look at them.

Best,

Inigo

pangxueyu233 commented 3 years ago

Thanks for the quick reply. One more thing I want to confirm about the results: does each row of the output files represent an individual eccDNA, or can a combination of rows form one eccDNA?

iprada commented 3 years ago

Each row should be an individual eccDNA. If some eccDNAs form a complex amplicon, they will be reported as independent lines. That is a limitation of Circle-Map. You can use AmpliconArchitect to resolve those.
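The "one row, one circle" reading can be sketched in Python as follows. The sketch assumes only that the output is tab-separated with chromosome, start and end in the first three columns, as in BED; any further columns are ignored. The function name and the example coordinates are made up for illustration.

```python
import io


def read_circles(handle):
    """Return one record per row; rows are never combined."""
    circles = []
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        circles.append(
            {"chrom": chrom, "start": start, "end": end, "length": end - start}
        )
    return circles


# Hypothetical rows, not real Circle-Map output.
example = io.StringIO("chr1\t1000\t1500\nchr1\t1400\t2600\nchr2\t50\t350\n")
circles = read_circles(example)
```

Note that the first two example rows overlap on chr1 but are still kept as two separate circles, mirroring how segments of a complex amplicon would appear as independent lines.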