Open Malabady opened 5 months ago
1) In the 4DTv density plot, I only see plots for the comparisons, but not for the single species that I listed. Is this expected?
Very curious results! This must be a species name parsing issue. Have you tried manually running the R script from here?
2) All plots have non-zero peaks at different 4DTv rates. I am a bit unclear about what this means. Does it mean that there was one WGD event, and the different 4DTv peaks simply reflect the divergence time between the two species? From the WF script, I see that it uses the pairwise paralogs and pairwise orthologs of the compared species. For instance, I assume the density plot for "mySpecies X SpeciesA" includes the 4DTv rates from the pairwise paralogs of mySpecies and of SpeciesA plus the pairwise orthologs between mySpecies and SpeciesA, and so on. That would be why the plots have different peaks yet all refer to a single WGD event. Is this an accurate understanding?
Yes, this is what is being plotted. Refer to the same lines in the R script as above.
3) In the main output directory, there is a "*.4DTv" file for every species in the analysis. When I made a density plot from the 4th column (which I assume holds the 4DTv rates) of each file, the plot was totally different from the one generated by the pipeline. These files contain the statistics for pairwise genes only, which I assume is what the WF uses. So why are the plots different? In fact, my density plot of the 4DTv rates for mySpecies (attached) has only a zero peak, unlike the peaks the pipeline produces in the comparisons.
This is unexpected. Let me know how manually running the Rscript goes for you.
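For anyone wanting to check the 4th column of a "*.4DTv" file independently of any plotting library, here is a minimal sketch in Python. It assumes the file is whitespace-delimited, that the 4th field holds the 4DTv value (as assumed in the question above), and that values fall between 0 and 1; none of this is confirmed by the pipeline's documentation, and the function names are hypothetical. Binning the values into a plain histogram makes peak positions visible without depending on a kernel bandwidth choice.

```python
def read_column(lines, index=3):
    """Collect numeric values from one whitespace-delimited column.

    index=3 is the 4th column (assumed to hold 4DTv values).
    Lines that are too short or non-numeric (e.g. a header) are skipped.
    """
    values = []
    for line in lines:
        fields = line.split()
        if len(fields) <= index:
            continue
        try:
            values.append(float(fields[index]))
        except ValueError:
            continue
    return values


def histogram(values, bins=20, lo=0.0, hi=1.0):
    """Count values per equal-width bin over [lo, hi] (range is an assumption)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi falls into the last bin.
            i = min(int((v - lo) / width), bins - 1)
            counts[i] += 1
    return counts
```

With a real file, something like `histogram(read_column(open("mySpecies.4DTv")))` would give the per-bin counts; the file name here is only a placeholder.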
Hello,
Thank you for the replies. As you suspected, there was a failure in parsing the "comparison" file because all single-species lines had an extra tab at the end, before the newline character. After fixing this, I now get both the individual species and the comparison plots. Using the R script directly, I got the following plot:
As you can see, the individual species have 4DTv rate peaks at "0", but the comparisons have peaks at various 4DTv rates. I am not sure how to interpret this. Is it expected? What is the difference between the single-species and comparison peaks? I understand the difference in the plotted data, but I am unclear about the interpretation.
Many thanks for your help,
Best,
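The trailing-tab problem described in this message can be guarded against by normalising each line before it is interpreted. The following is only an illustrative sketch in Python, not the pipeline's own parser: the function name is hypothetical, and it assumes one entry per line, with pairs written as two names separated by an "X" surrounded by whitespace.

```python
import re


def parse_comparisons(text):
    """Parse comparison entries, tolerating stray whitespace at line ends.

    Returns a list of tuples: 1-tuples for single species, 2-tuples for pairs.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()  # drops trailing tabs, spaces and '\r'
        if not line:
            continue  # ignore blank lines
        # Assumed format: "A X B" for a pair, a bare name for a single species.
        parts = re.split(r"\s+X\s+", line)
        entries.append(tuple(parts))
    return entries
```

Checking that the number of 1-tuples matches the number of single species listed is a quick way to confirm that no entry was silently dropped.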
You can think of the between species-pair 4DTvs as divergence times between their shared genomes. These comparisons are usually visualised/assessed when you believe the two species share a common ancestor, or one is hypothesised to be the parent of another - similar to my discussion point in the paper: "Arabidopsis suecica, an allopolyploid hybrid of A. thaliana and A. arenosa..."
Hello Jefferson,
My analysis includes my species of interest plus six other species. In my "comparisons_4DTv.txt", I listed single species as well as pairs of species, as follows:
``
mySpecies
SpeciesA
SpeciesB
SpeciesC
mySpecies X SpeciesA
mySpecies X SpeciesB
mySpecies X SpeciesC
mySpecies X SpeciesD
mySpecies X SpeciesE
mySpecies X SpeciesF