Closed Leo152 closed 2 years ago
Hi Leo, thank you for your suggestion. I added descriptions of the output files to the README. I hope this is helpful.
Thank you very much. I'm looking at the README. Another question: the article "eccDNAs are Apoptotic products with high Innate immunostimulatory activity" says that the analysis code can be obtained on GitHub, but I did not find it. Can this code be made public, and if so, could you provide a copy?
Thanks!
Hi Leo, all the code is provided in this repo. The code used for read mapping is available at:
https://github.com/icebert/eccDNA_RCA_nanopore/tree/main/mapping
The code for analyzing eccDNA from Nanopore mapped reads is available at:
https://github.com/icebert/eccDNA_RCA_nanopore/blob/main/eccDNA_RCA_nanopore.py
Sorry to bother you, and thank you very much for your reply.
Hello, sorry to bother you again. What are the specific processing steps and code for removing PCR duplicates and for running bedtools and karyoploteR in this section?
For these analyses, I used the command line as described in the methods (your screenshot) with the tools mentioned. I don't have a script for these analyses.
Regarding the analysis in this paragraph, I should use the BAM file output by minimap2 rather than the eccDNA_RCA_nanopore output file, right?
All the analyses are based on the eccDNA_RCA_nanopore output. The minimap2 output is just an intermediate file; the eccDNA_RCA_nanopore output is the final result. Each line is one identified eccDNA (we only use lines with Nfullpass>=2). For the genomic distribution, you only need the fragments column in the info file generated by eccDNA_RCA_nanopore. You can convert it to BED format and then use the uniq command to remove duplicates. Then bedtools genomecov -i <bed> -d -bg -g <chr.size> was used to generate coverage in bedGraph format, which was converted to bigWig format with bedGraphToBigWig. Then you can plot the eccDNA distribution using karyoploteR with the bigWig file as input.
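The conversion from the fragments column to a de-duplicated BED list described above could be sketched in Python as follows. This is only an illustration under stated assumptions, not the author's script (the author used the command line without a script). It assumes the info file is tab-delimited with a header row containing columns named Nfullpass and fragments, that each fragment is written as chrom:start-end, and that multiple fragments in one line are separated by "|". Check these against the README before using it. The function name fragments_to_bed is hypothetical, and coordinates are passed through unchanged.

```python
import csv
import io
import re

# ASSUMPTION: a fragment looks like "chr1:100-200" (anything after the end
# coordinate, such as a strand label, is ignored).
FRAGMENT_RE = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)")


def fragments_to_bed(info_handle, min_fullpass=2, sep="|"):
    """Return sorted, de-duplicated (chrom, start, end) tuples.

    ASSUMPTION: the column names "Nfullpass" and "fragments" and the
    separator "|" are guesses about the info file layout.
    """
    reader = csv.DictReader(info_handle, delimiter="\t")
    intervals = set()  # a set drops exact duplicates, like `sort | uniq`
    for row in reader:
        if int(row["Nfullpass"]) < min_fullpass:
            continue  # keep only lines with Nfullpass >= 2, as advised in the thread
        for frag in row["fragments"].split(sep):
            m = FRAGMENT_RE.match(frag.strip())
            if m is None:
                continue  # skip anything that does not match the assumed syntax
            intervals.add((m["chrom"], int(m["start"]), int(m["end"])))
    return sorted(intervals)


# Tiny made-up example standing in for a real info file.
example = io.StringIO(
    "readid\tNfullpass\tfragments\n"
    "r1\t3\tchr1:100-200\n"
    "r2\t2\tchr1:100-200\n"
    "r3\t1\tchr2:5-50\n"
    "r4\t2\tchr3:10-20|chr1:300-400\n"
)
bed = fragments_to_bed(example)
for chrom, start, end in bed:
    print(f"{chrom}\t{start}\t{end}")
```

Note that the uniq command only removes adjacent identical lines, so on the command line the BED file needs to be sorted first (for example with sort) for uniq to remove all exact duplicates.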
I see. Thank you for your reply
Hello, excuse me. In the info file, multiple reads like the ones in the picture all have the same fragment, so they can be regarded as the same eccDNA molecule, right?
If I want to know how many eccDNA molecules there are in a sample, can I count them after removing the duplicated fragments?
Yes, these are probably PCR duplicates
Is there a better duplicate removal software? I'm using uniq now, but it doesn't remove fragments that are several bp apart
I also used 'uniq'. Currently almost all PCR duplicate removal tools, including 'picard markduplicates', are based solely on coordinates.
May I ask if I understand this correctly? If the fragments are exactly the same, they are PCR duplicates. If the fragments differ by a few bp, they may be different reads from the same eccDNA.
Yes, your understanding is right.
I would like to ask whether the "Unique eccDNA" count in this table was obtained by removing only the exact PCR duplicates, or by also removing fragments that differ by a few bp.
In my understanding, fragments that differ by a few bp are the same eccDNA and should be removed, but because they are not exactly identical the software cannot remove them. The amount of data is also too large to remove them manually, so I don't know how to deal with them.
In this table, we used 'uniq' to get unique eccDNAs, so only fragments with exactly the same positions were treated as PCR duplicates.
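For readers who do want to collapse fragments that differ by only a few bp, one possible approach is a tolerance-based grouping like the sketch below. This is not what was done for the table discussed in this thread (that used exact matching with uniq), and it is not a feature of any tool mentioned here. The function name collapse_near_duplicates and the default tolerance of 10 bp are hypothetical choices for illustration, and whether near-identical fragments really are the same molecule remains a judgment call.

```python
from collections import defaultdict


def collapse_near_duplicates(intervals, tolerance=10):
    """Keep one representative per group of near-identical intervals.

    An interval is dropped when an already kept interval on the same
    chromosome has both its start and its end within `tolerance` bp.
    ASSUMPTION: `tolerance` is an arbitrary illustrative value.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    representatives = []
    for chrom in sorted(by_chrom):
        window = []  # kept intervals whose start is still within tolerance
        for start, end in sorted(by_chrom[chrom]):
            # Input is sorted by start, so older representatives can be
            # dropped from the window once their start is too far left.
            window = [(s, e) for s, e in window if start - s <= tolerance]
            if not any(abs(end - e) <= tolerance for _, e in window):
                window.append((start, end))
                representatives.append((chrom, start, end))
    return representatives


# Made-up intervals: two exact duplicates, one near duplicate, two distinct.
demo = [
    ("chr1", 100, 200),
    ("chr1", 103, 198),
    ("chr1", 100, 200),
    ("chr1", 100, 400),
    ("chr2", 100, 200),
]
unique = collapse_near_duplicates(demo, tolerance=5)
print(len(unique), "representative intervals")
```

The grouping is greedy: the first interval seen in sorted order becomes the representative, so results can depend on the tolerance chosen. It also treats each fragment independently, so eccDNAs made of several fragments would need to be compared as whole fragment sets instead.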
May I ask if the reference genome "MM10Combine" in your analysis can be downloaded? I only found the experimental data on the Internet, and I would like to compare them to see which step of my analysis has problems.
The mm10combine reference genome can be downloaded at: https://figshare.com/ndownloader/files/31960676
The example dataset can be downloaded at: https://figshare.com/ndownloader/files/31526759
The dataset of the original paper can be downloaded at GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM5058061
Many thanks
I would like to ask you a question. In the INFO file, unless the fragment is repeated, we treat each line as one eccDNA. However, after PCR amplification, most eccDNAs should be supported by multiple reads, as shown in the following figure. So why are so many eccDNAs supported by just one read? In other words, why do most eccDNAs have only one line in the INFO file, with no fragment duplication?
We used Rolling Circle Amplification (RCA), which is totally different from PCR. In PCR, one fragment is amplified over multiple cycles, but in RCA it is not. In addition, we have demonstrated that eccDNAs are generated from apoptotic DNA fragments, so they originate quite randomly from anywhere in the genome. It is therefore normal that many eccDNAs have only one read.
In addition, you should only use the lines with Nfullpass>=2.
Hi, Excuse me, what are the formats of the three files output by eccDNA_RCA_nanopore?
Are there any output instructions?
thanks!