Hi. Sorry for the slow reply. This is a bit surprising, but it can happen because CheckM is not deterministic. CheckM places your genome into a reference tree using pplacer, and this placement can change slightly each time CheckM is run. In general pplacer is very stable, but differences like this can occur.
Hi @dparks1134,
I have an issue with checkm lineage_wf. I have a set of bins that are annotated differently depending on whether I run them together or separately. I am using
checkm lineage_wf -x fasta -t 5 --file "test" ./ results
and I only change which bins are in the folder to obtain these results. When I run them together, I obtain:
Bin Id                             Marker lineage             # genomes  # markers  # marker sets  0   1    2  3  4  5+  Completeness  Contamination  Strain heterogeneity
bin_5354_H_all_kfilt3_MH_21_141.1  k__Bacteria (UID2495)      2993       143        89             6   136  1  0  0  0   94.38         1.12           0.00
5354_J_MH_concont_bin.163          c__Spirochaetia (UID2496)  72         215        125            10  205  0  0  0  0   93.60         0.00           0.00
bin_5354_R_all_kfilt3_MH_21_141.4  k__Bacteria (UID2495)      2993       143        89             7   132  4  0  0  0   93.26         3.56           50.00
5354_X_MH_concont_bin.117          c__Spirochaetia (UID2496)  72         215        125            12  203  0  0  0  0   92.80         0.00           0.00
But when I run bin_5354_H_all_kfilt3_MH_21_141.1 alone, I obtain a different marker lineage, and therefore different completeness and contamination estimates:
Bin Id                             Marker lineage             # genomes  # markers  # marker sets  0   1    2  3  4  5+  Completeness  Contamination  Strain heterogeneity
bin_5354_H_all_kfilt3_MH_21_141.1  c__Spirochaetia (UID2496)  72         215        125            18  197  0  0  0  0   88.00         0.00           0.00
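To spot this kind of discrepancy across many bins, the two result tables can be compared programmatically. The following is only a minimal sketch. It assumes the tables were written as tab-separated text (for example with CheckM's --tab_table option, which the command above does not use) and that the header names match the ones shown above. The function names read_checkm_table and diff_runs are illustrative, not part of CheckM. The sample rows are copied from the two runs reported here and reduced to the columns being compared.

```python
import csv
import io

# Columns compared between runs (header names assumed to match the tables above).
COLUMNS = ("Marker lineage", "Completeness", "Contamination")


def read_checkm_table(text):
    """Parse a tab-separated CheckM result table into {bin_id: row_dict}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["Bin Id"]: row for row in reader}


def diff_runs(run_a, run_b, columns=COLUMNS):
    """Return {bin_id: {column: (value_a, value_b)}} for bins present in both
    runs whose values differ in at least one of the given columns."""
    changes = {}
    for bin_id in sorted(run_a.keys() & run_b.keys()):
        changed = {
            col: (run_a[bin_id][col], run_b[bin_id][col])
            for col in columns
            if run_a[bin_id][col] != run_b[bin_id][col]
        }
        if changed:
            changes[bin_id] = changed
    return changes


# Rows taken from the two runs in this report (subset of columns only).
HEADER = "Bin Id\tMarker lineage\tCompleteness\tContamination\n"
together = HEADER + (
    "bin_5354_H_all_kfilt3_MH_21_141.1\tk__Bacteria (UID2495)\t94.38\t1.12\n"
    "5354_J_MH_concont_bin.163\tc__Spirochaetia (UID2496)\t93.60\t0.00\n"
)
alone = HEADER + (
    "bin_5354_H_all_kfilt3_MH_21_141.1\tc__Spirochaetia (UID2496)\t88.00\t0.00\n"
)

differences = diff_runs(read_checkm_table(together), read_checkm_table(alone))
for bin_id, changed in differences.items():
    for col, (value_a, value_b) in changed.items():
        print(f"{bin_id}\t{col}: {value_a} -> {value_b}")
```

Only bins present in both tables are compared, so a bin that was run in just one of the two folders is silently skipped rather than reported as a difference.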
The lineage.ms file is also different for both runs:
bin_5354_H_all_kfilt3_MH_21_141.1 10 UID2497 o__Spirochaetales 71
bin_5354_H_all_kfilt3_MH_21_141.1 11 UID2502 o__Spirochaetales 66
I reinstalled CheckM (v1.2.1) with conda today and also reinstalled the CheckM database.