AnantharamanLab / METABOLIC

A scalable high-throughput metabolic and biogeochemical functional trait profiler
Is it okay to use assembled contigs and metagenome reads as input to METABOLIC? #95

Open B-1991-ing opened 2 years ago

B-1991-ing commented 2 years ago

Hi Zhichao,

I recovered 55 MAGs from 6 metagenome samples, but some typical genes appear to be missing from some of the MAGs, possibly lost during binning or refinement. Is it okay to use the assembled contigs and the metagenome FASTQ reads as input to METABOLIC, to look at metabolic cycling at the community level? Does this approach make sense?

Best,

Bing

B-1991-ing commented 2 years ago

In my case, the contigs from the co-assembly of multiple metagenome samples are all in one FASTA file, whereas the contigs from the individual assemblies are in a separate FASTA file for each metagenome sample. I assume it is possible to run GTDB-Tk classification on each FASTA file, even though each contains many contigs?

Best,

Bing

ChaoLab commented 2 years ago

The assembly from each metagenome sample contains sequences from many microbial genomes. GTDB-Tk treats a single FASTA file as one genome, so it will not give a meaningful taxonomy for such a file.

haihao999 commented 1 year ago

Hi Zhichao, after refining the genomes, are there values that should be set for completeness and contamination? E.g. completeness > 50%, contamination < 10%?

ChaoLab commented 1 year ago

Yes, completeness >= 50% and contamination < 10% is a minimal standard when refining genomes.
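
For anyone applying this cutoff, here is a minimal sketch of the filter in Python. It assumes the quality estimates are in a tab-separated table with the columns `Bin Id`, `Completeness` and `Contamination`. That layout is an assumption for illustration, loosely modelled on CheckM-style output, and `filter_mags` is a hypothetical helper, not part of METABOLIC.

```python
import csv
import io

# Thresholds from the comment above: completeness >= 50%, contamination < 10%.
MIN_COMPLETENESS = 50.0
MAX_CONTAMINATION = 10.0


def filter_mags(tsv_text, min_completeness=MIN_COMPLETENESS,
                max_contamination=MAX_CONTAMINATION):
    """Return the IDs of bins that pass both quality thresholds.

    The column names ("Bin Id", "Completeness", "Contamination") are an
    assumed layout; rename them to match your own quality table.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        completeness = float(row["Completeness"])
        contamination = float(row["Contamination"])
        # Completeness is inclusive (>=), contamination is strict (<).
        if completeness >= min_completeness and contamination < max_contamination:
            kept.append(row["Bin Id"])
    return kept


# Made-up example values, only to exercise the boundary cases.
example = (
    "Bin Id\tCompleteness\tContamination\n"
    "bin_1\t92.3\t1.2\n"
    "bin_2\t48.9\t0.5\n"
    "bin_3\t75.0\t10.0\n"
    "bin_4\t50.0\t9.9\n"
)

print(filter_mags(example))  # ['bin_1', 'bin_4']
```

Note that `bin_3` is dropped because its contamination is exactly 10%, which does not satisfy the strict `< 10%` condition, while `bin_4` is kept at exactly 50% completeness.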

haihao999 commented 1 year ago

Thank you, Zhichao. I would like to ask another question: is it possible to analyze genomes from a public database together with the genomes I obtained from binning, after redundancy removal?

ChaoLab commented 1 year ago

METABOLIC annotates genomes comprehensively using multiple databases. It relies mainly on HMM searches.

haihao999 commented 1 year ago

Hi Zhichao, I mean that the genome was not obtained from my own samples; it is a publicly available genome that I am using to analyze my samples. Is this feasible? Thank you.

ChaoLab commented 1 year ago

I am not sure I understand what you are doing. If you are using other people's datasets, make sure you have the rights to use the data.