Open jiajinlongkang opened 2 years ago
Hi Jack, Thank you for raising this issue. It is something I am also considering for a later version.
ecc_finder was originally tested in plants, where the transposable elements that can produce eccDNA are typically less than 50 kb long. When I tested it on human samples (Wu et al., Nature, 2019), however, ecc_finder could identify potential breakpoints of eccDNA that spanned several Mb. The internal region between these breakpoints often lacked sufficient valid reads, so the candidate was not considered bona fide and failed at the peak-calling step.
For long eccDNA, I recommend using the information in the peak_files folder to track the breakpoints of long eccDNA with split and discordant reads.
split.bed (e.g., ecc.sr.split.bed) shows all potential split reads mapped on the genome, with read names in the 6th column.
disc.bed (e.g., ecc.sr.disc.bed) shows all potentially discordant reads mapped on the genome, with read names in the 6th column.
You can run bedtools intersect -a *split.bed -b *disc.bed
and filter the results to sizes of 1-3 Mb. Be careful: the output will contain many false positives, so you need to keep only long eccDNA candidates supported by at least 5 read pairs.
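The size and support filtering could be sketched in Python roughly as follows. This is not part of ecc_finder, only an illustration under several assumptions: the BED files are tab-separated with the read name in the 6th column (as described above), a candidate is defined by a read whose alignments on one chromosome span 1-3 Mb, and nearby breakpoints are grouped by crude 1 kb binning. The function names (read_spans, long_candidates) and the bin_size parameter are hypothetical.

```python
from collections import defaultdict


def read_spans(bed_lines):
    """Group BED records by (read name, chromosome) and return the
    outermost start/end coordinates seen for each group.
    Assumes the read name is in column 6."""
    spans = {}
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip comments, blank lines, and records without a 6th column.
        if line.startswith("#") or len(fields) < 6:
            continue
        chrom, start, end, name = fields[0], int(fields[1]), int(fields[2]), fields[5]
        key = (name, chrom)
        if key in spans:
            old_start, old_end = spans[key]
            spans[key] = (min(old_start, start), max(old_end, end))
        else:
            spans[key] = (start, end)
    return spans


def long_candidates(bed_lines, min_size=1_000_000, max_size=3_000_000,
                    min_reads=5, bin_size=1000):
    """Return {(chrom, binned_start, binned_end): n_supporting_reads}
    for spans of min_size..max_size bp backed by at least min_reads reads.
    Binning is an assumption made here to group nearby breakpoints."""
    support = defaultdict(set)
    for (name, chrom), (start, end) in read_spans(bed_lines).items():
        if min_size <= end - start <= max_size:
            locus = (chrom,
                     start // bin_size * bin_size,
                     end // bin_size * bin_size)
            support[locus].add(name)
    return {locus: len(names) for locus, names in support.items()
            if len(names) >= min_reads}
```

For example, opening ecc.sr.split.bed and passing the file object to long_candidates would return the binned loci that pass both filters. Whether read names carry mate suffixes, and whether the split and discordant files should be combined before counting, depends on your data, so check a few records by hand first.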
Best, Panpan
Dear njaupan,
Is there an option in ecc_finder that allows users to specify the maximum length of eccDNA to detect? I wonder if I can use it to identify long eccDNA with sizes of 1-3 Mb. Currently, all the eccDNA I get are below 150 kb.
Thanks, Jack