greenelab / core-accessory-interactome

Investigating the functional relationship between P. aeruginosa core and accessory genes.
BSD 3-Clause "New" or "Revised" License

Update manuscript figures and fix distance calculation #52

Closed ajlee21 closed 2 years ago

ajlee21 commented 2 years ago

This PR updates the manuscript figures and includes the following changes:

  1. Update 7_stability_vs_genome_location.ipynb to calculate the distance while accounting for the genome being circular. The results don't change after this update.
  2. Create a new figure 1 that describes the composition of the compendia: composition_of_compendia.ipynb
  3. Update the figures in figure_generation\output\. If you have any thoughts or suggestions to make the figures clearer, please feel free to let me know. Note: Figure 1 needs to be cleaned up so that there are fewer legends and redundant colors, but I need help from collaborators on how to consolidate the legend labels.
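
As a rough illustration of the circular-distance fix in item 1, here is a minimal sketch. The function name, arguments, and toy genome length are assumptions for illustration only, not the code in the notebook. The idea is that on a circular genome the distance between two positions is the shorter of the two arcs, not the plain difference.

```python
def circular_distance(pos_a, pos_b, genome_length):
    """Shortest distance between two positions on a circular genome.

    Hypothetical helper: positions are coordinates along a genome of
    `genome_length` bases; the result is the shorter of the two arcs.
    """
    linear = abs(pos_a - pos_b) % genome_length
    return min(linear, genome_length - linear)


# Toy example with a made-up genome length of 1000:
# positions 10 and 990 are 980 apart linearly, but only 20 apart
# when the wrap-around at the origin is taken into account.
wrapped = circular_distance(10, 990, 1000)
unwrapped = circular_distance(100, 400, 1000)
```

For pairs of genes that do not straddle the origin, the two calculations agree. That fits the note above that the results don't change, although the thread does not say this is the reason.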

ajlee21 commented 2 years ago

In figure 2, it's somewhat unclear what "% neighboring homologs that match" means. What do they match? It might be clearer if the sets in the Venn diagram were labeled.

The '% homologs matched' value is a measure of distance. We're asking whether a least stable core gene (i.e. the target gene) is in the same location in PAO1 and PA14. Since the genomes are directly mappable, we did this by looking at the genes neighboring the target gene and comparing the overlap between its neighbors in PAO1 and its neighbors in PA14. If the overlap is high, meaning that many of the neighbors are the same in PAO1 and PA14, this indicates that the gene is in the same location in both strains.
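
The neighbor-overlap idea described above could be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the analysis code. The function names, the window size `k`, and the toy gene labels are hypothetical. It also assumes the genes in both strains have already been relabeled with shared homolog IDs, so that neighbors can be compared directly.

```python
def neighbors(gene_order, target, k):
    """Set of genes within k positions of `target` on either side.

    `gene_order` lists genes in genome order; indices wrap around because
    the genome is circular. Assumes 2 * k < len(gene_order).
    """
    n = len(gene_order)
    i = gene_order.index(target)
    return {gene_order[(i + offset) % n]
            for offset in range(-k, k + 1) if offset != 0}


def percent_neighbors_matched(order_a, order_b, target, k=2):
    """Percent of the target's neighbors in strain A that also neighbor it in strain B."""
    nbrs_a = neighbors(order_a, target, k)
    nbrs_b = neighbors(order_b, target, k)
    if not nbrs_a:
        return 0.0
    return 100.0 * len(nbrs_a & nbrs_b) / len(nbrs_a)


# Toy gene orders using shared (hypothetical) homolog labels.
strain_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
strain_b_same = ["g1", "g2", "g3", "g4", "g5", "g6"]
strain_b_moved = ["g3", "g6", "g1", "g5", "g2", "g4"]

same = percent_neighbors_matched(strain_a, strain_b_same, "g3")
moved = percent_neighbors_matched(strain_a, strain_b_moved, "g3")
```

A high percentage suggests the target gene sits in the same neighborhood in both strains. A lower one suggests it has moved relative to its neighbors.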

As discussed offline, I'll add a description in the legend so it's clear to readers.

I really like the color-coded words in figure 3.

I'm glad it's clearer.

Panel C in figure 3 could use a title that gives context for what exoS and exoU are.

I can definitely add this so it's clearer.

Since I'm re-running the analysis notebooks, the results for the figures will be updated, so I'll add these changes to the next PR.