merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

Anvi_analyze_synteny output visualization #2281

Closed Qvdauwer closed 3 months ago

Qvdauwer commented 3 months ago

Dear,

I would like to look at synteny in a pangenome I constructed for different representatives of the Lactiplantibacillus plantarum species. Since I saw that the program anvi-analyze-synteny exists, I wanted to give it a try. However, I am unsure what to do with the output of this tool.

I ran it like this:

anvi-analyze-synteny -g LPLANT-FINAL-GENOMES.db -p LPLANT_PANGENOME_FINAL/Lplant-Final-Pangenome-PAN.db --ngram-window-range 2:3 -o Lplant_synteny_ngrams

I used my own contig db and pangenome db, which I constructed following the anvi'o pangenomics workflow. Both work as intended, so I think they are fine.

It gave me the following output (head of the file):

ngram   count   contig_db_name  N   number_of_loci
GC_00001011::GC_00001898    1   Lp_M1135    2   13
GC_00000877::GC_00001011    1   Lp_M1135    2   13
GC_00000877::GC_00001955    1   Lp_M1135    2   13
GC_00000415::GC_00001955    1   Lp_M1135    2   13
GC_00000415::GC_00000459    1   Lp_M1135    2   13
GC_00000320::GC_00000459    1   Lp_M1135    2   13
GC_00000320::GC_00000445    1   Lp_M1135    2   13
GC_00000331::GC_00000445    1   Lp_M1135    2   13
GC_00000331::GC_00001312    1   Lp_M1135    2   13
GC_00000150::GC_00001312    1   Lp_M1135    2   13
GC_00000150::GC_00001413    1   Lp_M1135    2   13
GC_00000481::GC_00001413    1   Lp_M1135    2   13
GC_00000481::GC_00001134    1   Lp_M1135    2   13
GC_00001134::GC_00001502    1   Lp_M1135    2   13
GC_00001502::GC_00003916    1   Lp_M1135    2   13
GC_00000478::GC_00003916    1   Lp_M1135    2   13
GC_00000068::GC_00000478    1   Lp_M1135    2   13
GC_00000068::GC_00001216    1   Lp_M1135    2   13
GC_00000520::GC_00001216    1   Lp_M1135    2   13
GC_00000107::GC_00000520    1   Lp_M1135    2   13
GC_00000107::GC_00000107    1   Lp_M1135    2   13
GC_00000096::GC_00000107    1   Lp_M1135    2   13
GC_00000096::GC_00000829    1   Lp_M1135    2   13
GC_00000829::GC_00001347    1   Lp_M1135    2   13
GC_00001347::GC_00001395    1   Lp_M1135    2   13
GC_00000240::GC_00001395    1   Lp_M1135    2   13
GC_00000240::GC_00001171    1   Lp_M1135    2   13
GC_00000007::GC_00001171    1   Lp_M1135    2   13
GC_00000007::GC_00001099    1   Lp_M1135    2   13
GC_00000948::GC_00001099    1   Lp_M1135    2   13

If I understand correctly, it lists gene clusters that are connected together (as ngrams) within a given contig db. But I don't think I really understand what this means with respect to my genomes.
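One way to read the rows above is as the result of sliding a window of N gene clusters along each locus. The following is a minimal Python sketch of that idea, not anvi'o's actual implementation. It assumes that the rows are listed in genomic order and that each window is written in a canonical orientation (whichever of the window and its reverse sorts first), which is what the pairs in the listing suggest. The function name `ngrams` is made up for this illustration.

```python
def ngrams(gene_clusters, n):
    """Slide a window of size n over an ordered list of gene cluster names."""
    result = []
    for i in range(len(gene_clusters) - n + 1):
        window = tuple(gene_clusters[i:i + n])
        # Assumption: a window and its reverse describe the same neighborhood,
        # so keep whichever orientation sorts first.
        canonical = min(window, tuple(reversed(window)))
        result.append("::".join(canonical))
    return result


# Gene cluster order inferred from the first four rows of the output above.
locus = ["GC_00001898", "GC_00001011", "GC_00000877", "GC_00001955", "GC_00000415"]
pairs = ngrams(locus, 2)
print(pairs)
```

Under these assumptions, the four pairs printed match the first four rows of the output. So each row would mean "these gene clusters sit next to each other in this genome".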

Also, do you have any information on how I could use this data to create a visual representation of the synteny within my pangenome? I am struggling to find a way to do this.

Thank you in advance for the help.

Best regards,
Quentin

meren commented 3 months ago

Dear Quentin,

This tool has been very useful for us to study the conservation of synteny among gene clusters or functions across genomes, but anvi'o doesn't offer many downstream visualization options for its output. Your output lists the gene clusters that occur next to one another across genomes. More information is here: https://anvio.org/help/main/programs/anvi-analyze-synteny/
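As a rough illustration of how such a table could be summarized outside anvi'o, the Python sketch below groups ngrams by the genomes they occur in, so that adjacencies shared by several genomes stand out. It assumes the output file is tab-delimited with the column names shown in the question. The helper name `genomes_per_ngram` and the second genome name `Lp_EXAMPLE` are made up for the example.

```python
import csv
import io
from collections import defaultdict


def genomes_per_ngram(handle):
    """Map each ngram to the set of contig_db_name values it appears in."""
    seen = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        seen[row["ngram"]].add(row["contig_db_name"])
    return seen


# Tiny inline sample; the last row uses a hypothetical genome name (Lp_EXAMPLE).
sample = (
    "ngram\tcount\tcontig_db_name\tN\tnumber_of_loci\n"
    "GC_00001011::GC_00001898\t1\tLp_M1135\t2\t13\n"
    "GC_00000877::GC_00001011\t1\tLp_M1135\t2\t13\n"
    "GC_00001011::GC_00001898\t1\tLp_EXAMPLE\t2\t13\n"
)

seen = genomes_per_ngram(io.StringIO(sample))
# ngrams found in more than one genome, i.e. conserved adjacencies
shared = sorted(ngram for ngram, genomes in seen.items() if len(genomes) > 1)
print(shared)
```

To run this on a real result, pass an open file handle for the output file instead of the inline sample. The resulting sets could then feed a presence/absence matrix or any plotting tool of your choice.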

We are now developing a new strategy, but it will need another few months to be ready for prime time.

Best wishes,