njaupan / ecc_finder

a tool to detect eccDNA using Illumina and ONT sequencing
GNU General Public License v3.0

Is there an option to specify the size of eccDNA to detect #8

Open jiajinlongkang opened 2 years ago

jiajinlongkang commented 2 years ago

Dear njaupan,

Is there an option in ecc_finder that allows users to specify the max length of eccDNA to detect? I wonder if I can use it to identify long eccDNA with the size of 1-3 Mb. Currently what I got are all below 150 kb.

Thanks, Jack

njaupan commented 2 years ago

Hi Jack, thank you for raising this issue. It is something I am also considering including in a later version.

ecc_finder was originally tested in plants, where the transposable elements that can produce eccDNA are typically shorter than 50 kb. When I tested it on human samples (Wu et al., Nature. 2019), ecc_finder could identify potential eccDNA breakpoints spanning several Mb. However, the internal region between these breakpoints often lacked sufficient valid reads, so the candidate was not considered bona fide and failed at the peak-calling step.

For long eccDNA, I recommend using the information in the peak_files folder to track the breakpoints of long eccDNA with split and discordant reads.

split.bed (e.g., ecc.sr.split.bed) lists all potential split reads mapped to the genome, with read names in the 6th column. disc.bed (e.g., ecc.sr.disc.bed) lists all potentially discordant reads mapped to the genome, also with read names in the 6th column.

You can run bedtools intersect -a *split.bed -b *disc.bed and filter the results to a size of 1-3 Mb. Be careful: the output will contain many false positives, so keep only long eccDNA candidates supported by at least 5 read pairs.
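
The size and support filter described above could be sketched in Python roughly as follows. This is only one possible reading of the step, not part of ecc_finder: the function name filter_long_candidates is hypothetical, and the sketch assumes that columns 1-3 of each line are the standard BED chrom, start, and end of a candidate span and that the 6th column holds the read name. The thread confirms only the read-name column, so check the column layout of your own files before relying on it.

```python
from collections import defaultdict

# Thresholds taken from the thread: 1-3 Mb candidates, at least 5 supporting reads.
MIN_SIZE = 1_000_000
MAX_SIZE = 3_000_000
MIN_SUPPORT = 5


def filter_long_candidates(bed_lines, min_size=MIN_SIZE, max_size=MAX_SIZE,
                           min_support=MIN_SUPPORT):
    """Return (chrom, start, end, n_reads) for intervals passing both filters.

    ASSUMPTION: each tab-separated line has chrom, start, end in columns 1-3
    and a read name in column 6. Lines with fewer than 6 columns are skipped.
    """
    support = defaultdict(set)  # (chrom, start, end) -> distinct read names
    for line in bed_lines:
        line = line.rstrip("\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and BED header lines
        fields = line.split("\t")
        if len(fields) < 6:
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        read_name = fields[5]
        if min_size <= end - start <= max_size:
            support[(chrom, start, end)].add(read_name)
    # Keep only intervals backed by enough distinct reads.
    return sorted(
        (chrom, start, end, len(reads))
        for (chrom, start, end), reads in support.items()
        if len(reads) >= min_support
    )
```

To use it, pass an open file handle or a list of lines, for example the saved output of the bedtools intersect command. Counting distinct read names per interval is a simplification; if support should be counted per read pair or per breakpoint cluster instead, the grouping key would need to change.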

Best. Panpan