This PR addresses issue #2238. When users provide a collection name to anvi-estimate-metabolism, we now only use gene calls that belong to the splits in that collection. I also updated the Mode output for the case where users provide both --metagenome-mode and a collection, so it now shows this:
Mode (what we are estimating metabolism for) .: Individual contigs within a collection in a metagenome
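For reviewers who want a picture of the filtering idea, here is a minimal sketch of restricting gene calls to the splits in a collection. It is not the code in this PR and does not use anvi'o's internal API: the function name `filter_gene_calls_by_splits`, the dictionary layout, and the split names are all hypothetical, made up only for illustration.

```python
# Illustrative sketch only: hypothetical names and data, not anvi'o internals.

def filter_gene_calls_by_splits(gene_call_to_split, splits_in_collection):
    """Return the set of gene call ids whose split is part of the collection.

    gene_call_to_split: dict mapping a gene call id to the name of its split
    splits_in_collection: iterable of split names described by the collection
    """
    wanted = set(splits_in_collection)  # set gives fast membership checks
    return {gc for gc, split in gene_call_to_split.items() if split in wanted}


# Toy data (assumed for the example, not taken from the Infant Gut dataset)
gene_call_to_split = {
    1: "contig_A_split_00001",
    2: "contig_A_split_00001",
    3: "contig_A_split_00002",
    4: "contig_B_split_00001",
}
collection_splits = ["contig_A_split_00001", "contig_B_split_00001"]

kept = filter_gene_calls_by_splits(gene_call_to_split, collection_splits)
print(sorted(kept))  # [1, 2, 4]
```

Gene call 3 is dropped because its split is not in the collection; everything downstream would then only see the kept gene calls.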
I tested it using the Infant Gut dataset and a collection including only a few splits:
head -n 30 additional-files/collections/merens.txt > meren_partial_collection.txt
anvi-import-collection -p PROFILE.db -C MEREN meren_partial_collection.txt -c CONTIGS.db
anvi-estimate-metabolism -c CONTIGS.db -p PROFILE.db -C MEREN --add-coverage -O test_coverage --metagenome-mode
It now shows the following new warning when it restricts the relevant splits to those in the collection, and it works with only the 181 gene calls that belong to those splits:
WARNING
===============================================
Since a collection name was provided, we will only work with gene calls from the
subset of 30 splits in the collection for the purposes of estimating metabolism.
Gene calls from these sources ................: 181 found
* Since the --add-coverage flag was provided, we are now loading the relevant
coverage information from the provided profile database.
WARNING
===============================================
A subset of splits (30 of 4784, to be precise) are requested to initiate gene-
level coverage stats for. No need to worry, this is just a warning in case you
are as obsessed as wanting to know everything there is to know.
I also tested these cases:
And I also tested it with anvi-self-test --suite metabolism -T 6 to make sure nothing else broke from these changes (and everything was fine) :)