Open brymerr921 opened 6 years ago
Hi Bryan,
These are awesome ideas and we would like to implement them, and the sketch explains them very well. We have already improved many things about the pangenomics workflow for v4, which we will release soon, including some changes to the inspect page (color outputs, popups with detailed information about the gene caller), but I think adding genomic context will make it much better. I will work on this after releasing this version; hopefully we can have these features ready for v5. I will comment here once we have a prototype on master.
All the best, Ozcan
Ozcan,
Thanks, that is wonderful news! I'll be first in line to test it and provide feedback when it rolls out.
Some other thoughts bouncing around my head about this are:
Trying to be unambiguous about what these "units of several genes" actually are. I've often heard a set of genes located near each other on a chromosome called a "gene cluster", but clearly a different term is needed in the context of Anvi'o.
A specific way to annotate genes that belong to a defined functional cluster. At present, I can hack this by feeding a file to anvi-import-functions where genes belonging to the same functional cluster are annotated with the same unique identifier (e.g. gene_cluster_1).
Synteny-aware gene/protein clustering. Based on some user-defined parameters, it may be interesting to determine (while running anvi-pan-genome, or afterwards as some sort of filter) whether any genes in an Anvi'o gene (protein) cluster should be kicked out because their genomic context is different. For example, for proteins A and B to be members of the same protein cluster (high % identity, etc.), they must pass the minbit and other thresholds, but they would also need at least n genes within m genes of protein A that share a protein cluster with genes within m genes of protein B.
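To make the n/m idea above concrete, here is a minimal sketch in Python. It is not anvi'o code and does not use any anvi'o API; the data model (each genome as an ordered list of protein cluster IDs along a linear contig) and the names neighborhood and shares_context are assumptions made up for illustration.

```python
from typing import List, Set


def neighborhood(order: List[str], index: int, m: int) -> Set[str]:
    """Return the cluster IDs of genes within m positions of order[index].

    The gene at `index` itself is excluded. Assumption: contigs are linear,
    so the window is simply truncated at the contig ends (no wrap-around).
    """
    start = max(0, index - m)
    stop = min(len(order), index + m + 1)
    return {order[k] for k in range(start, stop) if k != index}


def shares_context(order_a: List[str], idx_a: int,
                   order_b: List[str], idx_b: int,
                   n: int, m: int) -> bool:
    """True if at least n protein clusters occur within m genes of both genes."""
    shared = neighborhood(order_a, idx_a, m) & neighborhood(order_b, idx_b, m)
    return len(shared) >= n


# Hypothetical gene orders; "PC_01" is the protein cluster being checked.
genome_1 = ["PC_10", "PC_11", "PC_01", "PC_12", "PC_13"]
genome_2 = ["PC_10", "PC_11", "PC_01", "PC_12", "PC_99"]
genome_3 = ["PC_50", "PC_51", "PC_01", "PC_52", "PC_53"]

# genome_1 and genome_2 share three flanking clusters, so the pair passes n=3.
same_context = shares_context(genome_1, 2, genome_2, 2, n=3, m=2)
# genome_3 has no flanking clusters in common with genome_1, so it fails n=1.
different_context = shares_context(genome_1, 2, genome_3, 2, n=1, m=2)
```

A filter like this could run after the usual similarity thresholds, flagging cluster members whose neighborhoods do not match instead of removing them outright.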
Best, Bryan
Hi, @ozcan, I was wondering if there are any updates on this front. Thanks!
Hey Bryan,
We are finalizing v5, and clearly this feature will be for another spring :/ You will see from the release notes it was a very busy period for anvi'o developers and many outstanding features are waiting to be implemented :/
Sorry.
Hi Anvi'o developers,
I've been using the pangenomic workflow a lot, but (1) knowing the genomic context of the genes that are a part of each protein cluster and (2) having quick access to the functions of genes in a protein cluster would be super useful. For instance, proteins that cluster together might look similar at the amino acid level but be involved in different bacterial pathways if the surrounding genes are different. This visualization will support analyses of bacterial pathways instead of just genes by themselves.
Here's a drawing that shows what I would find extremely helpful. I think it'd make sense as an additional part of the page that appears when I right-click on a protein cluster in anvi-display-pan and choose "Inspect", or perhaps as a separate menu option after right-clicking. The main ideas are:
anvi-interactive and anvi-refine.
anvi-interactive or anvi-refine that shows information about the gene call and its annotations.
As always, thanks for making Anvi'o great!
Bryan