merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

Better utilizing and disseminating gene cluster homogeneity indices for pangenomics #979

Closed: meren closed this issue 6 years ago

meren commented 6 years ago

Thanks to @mahmoudyousef98's excellent addition to the codebase (see #977), the anvi'o pangenomic workflow has two new indices for investigating within-gene-cluster homogeneity.

At least two steps are needed going forward: making this powerful addition available for filtering gene clusters, and describing it in our documentation.

Documentation

We need to update our tutorial here:

http://merenlab.org/2016/11/08/pangenomics-v2/

Perhaps we can add a new section dedicated to homogeneity indices right after this one:

http://merenlab.org/2016/11/08/pangenomics-v2/#inspecting-gene-clusters

It will help others to better understand what those new layers are.

Implementing new filters

We currently allow users to filter their gene clusters based on various metrics. This is done through a function that is used both by the program anvi-get-sequences-for-gene-clusters (see the "ADVANCED FILTERS" section in the help menu), and through the user interface:

(image: gene cluster filters in the user interface)

If a pangenome includes functional or geometric homogeneity indices in its additional misc data tables, then we could also make these filters available through both the interface and the command line program. A query like this would then be possible: "Give me all gene clusters that occur in all genomes with a geometric homogeneity >= 1.0 and a functional homogeneity < 1.0". This way a user can get all non-identical core gene clusters at once.
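To make the intended query concrete, here is a minimal Python sketch of the filtering logic. It is not anvi'o code: the `GeneCluster` record, the `filter_gene_clusters` function, its parameter names, and the example values are all hypothetical and only illustrate how the two thresholds would combine with the "occurs in all genomes" criterion. Clusters that lack an index are skipped rather than raising an error, which is one possible way to treat pangenomes that lack these layers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GeneCluster:
    # Hypothetical record for illustration; not an anvi'o class.
    name: str
    num_genomes_present: int
    functional_homogeneity: Optional[float] = None
    geometric_homogeneity: Optional[float] = None


def filter_gene_clusters(
    clusters: List[GeneCluster],
    total_genomes: int,
    min_geometric: Optional[float] = None,
    max_functional_exclusive: Optional[float] = None,
) -> List[GeneCluster]:
    """Keep clusters found in every genome that also satisfy the index bounds."""
    selected = []
    for gc in clusters:
        # Core criterion: the cluster must occur in all genomes.
        if gc.num_genomes_present != total_genomes:
            continue
        # Geometric homogeneity must be >= min_geometric (missing value fails).
        if min_geometric is not None:
            if gc.geometric_homogeneity is None or gc.geometric_homogeneity < min_geometric:
                continue
        # Functional homogeneity must be strictly < the bound (missing value fails).
        if max_functional_exclusive is not None:
            if gc.functional_homogeneity is None or gc.functional_homogeneity >= max_functional_exclusive:
                continue
        selected.append(gc)
    return selected


# Made-up example data for a pangenome of three genomes.
clusters = [
    GeneCluster("GC_0001", 3, functional_homogeneity=1.0, geometric_homogeneity=1.0),
    GeneCluster("GC_0002", 3, functional_homogeneity=0.85, geometric_homogeneity=1.0),
    GeneCluster("GC_0003", 2, functional_homogeneity=0.70, geometric_homogeneity=1.0),
    GeneCluster("GC_0004", 3),  # no homogeneity indices available
]

core_non_identical = filter_gene_clusters(
    clusters, total_genomes=3, min_geometric=1.0, max_functional_exclusive=1.0
)
print([gc.name for gc in core_non_identical])  # ['GC_0002']
```

In this made-up example only GC_0002 passes: GC_0001 is functionally identical, GC_0003 is not core, and GC_0004 has no indices to test.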

What do you think @mahmoudyousef98?

mahmoudyousef98 commented 6 years ago

Sounds good to me! I'll start working on it.

meren commented 6 years ago

Self note: pangenomes without homogeneity indices break anvi-display-pan and we need to address this before the minor release.

meren commented 6 years ago

The documentation is updated, and the problem noted in the reminder above is fixed by #991 and the commits that follow it.