merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

[FEATURE REQUEST] Expanding SCGs for anvi-estimate-scg-taxonomy #2095

Open mschecht opened 1 year ago

mschecht commented 1 year ago

The need

anvi-estimate-scg-taxonomy currently works with only the following ribosomal proteins:

'Ribosomal_S2, Ribosomal_S3_C, Ribosomal_S6, Ribosomal_S7, Ribosomal_S8, Ribosomal_S9, Ribosomal_S11, Ribosomal_S20p, Ribosomal_L1, Ribosomal_L2, Ribosomal_L3, Ribosomal_L4, Ribosomal_L6, Ribosomal_L9_C, Ribosomal_L13, Ribosomal_L16, Ribosomal_L17, Ribosomal_L20, Ribosomal_L21p, Ribosomal_L22, ribosomal_L24, Ribosomal_L27A'

However, there are more SCGs than this! For example: Ribosomal_S17 and RNA_pol_B.

The solution

Download the GTDB marker genes (bac120_marker_genes_reps_r207.tar.gz) and annotate them with Bacteria_71 to match each Pfam model with its GTDB marker gene. Then we can add them here.
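A rough sketch of the matching step, in case it helps the discussion. It assumes the annotation results have already been reduced to (GTDB marker, Bacteria_71 model) pairs, one per annotated sequence. That input format, the function names, and the `marker_A`-style labels are all made up for illustration and are not anvi'o or GTDB conventions. For each marker it keeps the most frequent model, then reports the models that are not yet in the current list.

```python
from collections import Counter

# SCGs currently used by anvi-estimate-scg-taxonomy, as listed in this issue.
CURRENT_SCGS = {
    "Ribosomal_S2", "Ribosomal_S3_C", "Ribosomal_S6", "Ribosomal_S7",
    "Ribosomal_S8", "Ribosomal_S9", "Ribosomal_S11", "Ribosomal_S20p",
    "Ribosomal_L1", "Ribosomal_L2", "Ribosomal_L3", "Ribosomal_L4",
    "Ribosomal_L6", "Ribosomal_L9_C", "Ribosomal_L13", "Ribosomal_L16",
    "Ribosomal_L17", "Ribosomal_L20", "Ribosomal_L21p", "Ribosomal_L22",
    "ribosomal_L24", "Ribosomal_L27A",
}


def best_model_per_marker(hits):
    """Map each GTDB marker to the Bacteria_71 model that hit it most often.

    `hits` is an iterable of (gtdb_marker, hmm_model) pairs. This is a
    hypothetical input format, not something anvi'o produces as-is.
    """
    counts = {}
    for marker, model in hits:
        counts.setdefault(marker, Counter())[model] += 1
    return {marker: c.most_common(1)[0][0] for marker, c in counts.items()}


def new_candidates(hits, current=CURRENT_SCGS):
    """Return the sorted models matched to a marker but absent from `current`."""
    best = best_model_per_marker(hits)
    return sorted({model for model in best.values() if model not in current})


# Toy data only; marker labels are placeholders, not real GTDB identifiers.
toy_hits = [
    ("marker_A", "Ribosomal_S17"),
    ("marker_A", "Ribosomal_S17"),
    ("marker_A", "Ribosomal_L2"),
    ("marker_B", "RNA_pol_B"),
    ("marker_C", "Ribosomal_S2"),
]

print(new_candidates(toy_hits))  # ['RNA_pol_B', 'Ribosomal_S17']
```

Taking the majority hit per marker is just one simple way to resolve markers that hit more than one model; a score or e-value cutoff may well be a better choice on real data.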

Beneficiaries

Those who wish to know who's there! Also, those who find our current marker genes are limiting their science.