Question of the output - Githubissues

psj1997 commented 4 years ago

Hi, I have tried AMBER, but I am confused by the output: it reports both "Average completeness (bp)" and "CAMI 1 average completeness (bp)". What is the difference between them? Thanks for your suggestions!

fernandomeyer commented 4 years ago

Hello, the CAMI 1 average completeness is computed as in the CAMI 1 challenge (Sczyrba et al. Nature Methods 2017) and in AMBER v1 (Meyer et al. GigaScience 2018).

The average completeness (without the CAMI 1) is computed, for each genome, from the predicted bin containing the largest number of base pairs (bp) of the genome. It is the average of the number of bp (or contigs) of the genome in that bin divided by the genome size (in bp or contigs). In other words, the bins with the highest completeness per genome are considered. In the CAMI 1 way, the bins in which each genome is the most abundant in bp (compared to the other genomes in the same bin) are considered (see details in the references above).
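The two selection rules described above can be sketched in a few lines of Python. This is only an illustration of the description in this reply, not AMBER's actual code: the names `overlap_bp`, `genome_size_bp`, `average_completeness`, and `cami1_average_completeness` are hypothetical, the toy numbers are made up, and the handling of unmapped genomes and of several bins mapping to one genome is an assumption (see the references for the exact definitions).

```python
from collections import defaultdict

# Hypothetical toy data: (bin, genome) -> bp of that genome placed in that bin.
overlap_bp = {
    ("bin1", "g1"): 40,
    ("bin1", "g2"): 100,
    ("bin2", "g1"): 5,
}
genome_size_bp = {"g1": 50, "g2": 1000}


def average_completeness(overlap_bp, genome_size_bp):
    """Per genome, take the bin holding most of its bp, then average the ratios."""
    best = defaultdict(int)
    for (_bin_id, genome), bp in overlap_bp.items():
        best[genome] = max(best[genome], bp)
    ratios = [best[g] / size for g, size in genome_size_bp.items()]
    return sum(ratios) / len(ratios)


def cami1_average_completeness(overlap_bp, genome_size_bp):
    """Per bin, pick its most abundant genome (in bp), then score each genome
    from the bins mapped to it.

    Assumptions of this sketch: genomes with no mapped bin score 0, and bp are
    summed if several bins map to the same genome.
    """
    by_bin = defaultdict(dict)
    for (bin_id, genome), bp in overlap_bp.items():
        by_bin[bin_id][genome] = bp
    mapped_bp = defaultdict(int)
    for genomes in by_bin.values():
        top = max(genomes, key=genomes.get)  # most abundant genome in this bin
        mapped_bp[top] += genomes[top]
    ratios = [mapped_bp[g] / size for g, size in genome_size_bp.items()]
    return sum(ratios) / len(ratios)


print(average_completeness(overlap_bp, genome_size_bp))
print(cami1_average_completeness(overlap_bp, genome_size_bp))
```

In this toy case g1 is scored from bin1 (40 of 50 bp) under the first rule, but from bin2 (5 of 50 bp) under the CAMI 1 rule, because bin1 is dominated by g2 and therefore maps to g2.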

Note that you can also find a definition of each metric by moving the mouse pointer over it in the HTML output.

psj1997 commented 4 years ago

Thanks!

jiaojiaoguan commented 3 weeks ago

Hello, thanks for your response. I just wanted to check with you whether my understanding is correct. For example, suppose there is a bin named bin1 that contains two contigs, contig1 and contig2. The details are below:

Contig    Length   Genome   Genome length
contig1   10 bp    g1       50 bp
contig2   100 bp   g2       1000 bp

The length of contig1 is 10 bp and the length of contig2 is 100 bp. "g" stands for genome: the total length of g1 is 50 bp and the total length of g2 is 1000 bp.

In CAMI 1, g2 would be assigned to bin1, since the most abundant genome in bin1 is g2 (contig2, 100 bp). But the completeness of g1 is 10/50 and the completeness of g2 is 100/1000. If we instead assign the genome by completeness, g1 has the highest completeness, so the genome label of the bin is now g1.
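The arithmetic in this example can be checked with a short Python snippet. It only reproduces the numbers from the table above for the single bin bin1; the variable names are hypothetical and it is not AMBER code, so it does not settle how AMBER itself labels the bin.

```python
# Numbers copied from the example: bp of each genome inside bin1, and genome sizes.
bp_in_bin1 = {"g1": 10, "g2": 100}
genome_size_bp = {"g1": 50, "g2": 1000}

# Genome contributing the most bp to bin1.
most_abundant = max(bp_in_bin1, key=bp_in_bin1.get)

# Fraction of each genome that is contained in bin1.
completeness = {g: bp_in_bin1[g] / genome_size_bp[g] for g in bp_in_bin1}
most_complete = max(completeness, key=completeness.get)

print(most_abundant)   # g2
print(completeness)    # {'g1': 0.2, 'g2': 0.1}
print(most_complete)   # g1
```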

CAMI-challenge / AMBER

Question of the output #40