Open MelanieCHay opened 4 years ago
Back again, looks like there might be a mismatch between contig names and split names?
Is this an easy fix? I've looked for a contig-mode or split-mode option, like the one available for 'anvi-import-collection'. I am also wondering whether to just hack this by editing the tab-delimited file from the bin export, removing the split info and keeping the contig info.
You did everything right, and that's why you're running into this issue :) Many apologies for this very confusing situation.
Usually people use contig names in files to merge with anvi-script-merge-collections, and the requirement of a contigs database here is to find out the translation of those contig names to split names (so an anvi'o additional data table can be generated with split names).
You already have split names, so all is golden, but anvi'o is treating them as contig names to find out the corresponding split names in the database for each one of them.
To solve this, we need a --splits-mode flag for the anvi-script-merge-collections script.
I will take a quick look and write back, @MelanieCHay.
Hi, were you able to deal with this issue? I think I am facing a similar problem. Thank you
I don't think we ended up implementing a --splits-mode for this, @jimen210. If possible, I would suggest using contig names for the script input (by simply removing the _split_xxxxx part from the split names). The only reason I can imagine for not using contig names is if some of your contigs are split across different bins (i.e., one split from a contig is in one bin and another split from the same contig is in a different bin), but that usually means the binning was done very poorly, and I'm not sure it's worth keeping bins like that.
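For anyone who wants to script that workaround, here is a minimal sketch in Python. It is not part of anvi'o. It assumes the exported collection files are two-column, tab-delimited text (split name, then bin name) and that every split name ends in "_split_" followed by digits, as in the example from the original post. The function names and the output file name are made up for illustration. Contigs whose splits land in more than one bin are left out of the output and returned so they can be inspected by hand.

```python
import re
from collections import defaultdict

# Assumption: split names look like "<contig>_split_<digits>".
SPLIT_SUFFIX = re.compile(r"_split_\d+$")


def splits_to_contigs(lines):
    """Return {contig_name: set of bin names} from 'split<TAB>bin' lines."""
    bins_per_contig = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        split_name, bin_name = line.split("\t")[:2]
        contig = SPLIT_SUFFIX.sub("", split_name)
        bins_per_contig[contig].add(bin_name)
    return bins_per_contig


def convert(in_path, out_path):
    """Write 'contig<TAB>bin' lines; return contigs found in several bins."""
    with open(in_path) as handle:
        bins_per_contig = splits_to_contigs(handle)

    # Contigs whose splits were assigned to different bins cannot be
    # represented by a single contig-level line, so they are set aside.
    conflicts = {c: b for c, b in bins_per_contig.items() if len(b) > 1}

    with open(out_path, "w") as out:
        for contig, bins in bins_per_contig.items():
            if contig not in conflicts:
                out.write(f"{contig}\t{next(iter(bins))}\n")
    return conflicts
```

Calling something like convert("COLL-concoct.txt", "COLL-concoct-contigs.txt") for each exported file (the second file name is hypothetical) should give files that can be passed to anvi-script-merge-collections with -i. If the returned dictionary is not empty, those contigs are the ones split across bins.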
Hello,
Below are the results of anvi-self-test --version
Anvi'o version ...............................: esther (v6.1)
Profile DB version ...........................: 31
Contigs DB version ...........................: 14
Pan DB version ...............................: 13
Genome data storage version ..................: 6
Auxiliary data storage version ...............: 2
Structure DB version .........................: 1
I am comparing binning methods on a large number of contigs. I'd like to add the collections as a layer and then do some 'consensus-binning' in anvi-interactive before refining.
I have exported my collections and now have txt files of contigs and bin names.
E.g.
c_000000000001_split_00001 Bin_24
c_000000000003_split_00001 Bin_24
c_000000000006_split_00001 Bin_24
c_000000000009_split_00001 Bin_24
c_000000000029_split_00001 Bin_24
c_000000000078_split_00001 Bin_24
c_000000000082_split_00001 Bin_24
c_000000000087_split_00001 Bin_24
c_000000000091_split_00001 Bin_24
c_000000000095_split_00001 Bin_24
So looks good.
I have done this with concoct, metabat2, maxbin2, and dastool.
I then tried to merge the collections using:
anvi-script-merge-collections -c CONTIGS.db \
                              -i additional-files/external-binning-results/*.txt \
                              -o collections.tsv
But I get an error. It looks like this.
~/data/sval-anvio$ anvi-script-merge-collections -c 03-contigs/sval_mg_contigs.db -i COLL-concoct.txt COLL-maxbin2.txt COLL-metabat2.txt COLL-dastool.txt -o binning_collections.tsv
New Source ...................................: COLL-concoct, w/ 163249 contigs
New Source ...................................: COLL-maxbin2, w/ 163249 contigs
New Source ...................................: COLL-metabat2, w/ 163249 contigs
New Source ...................................: COLL-dastool, w/ 163249 contigs
Final number of unique contigs ...............: 163,249
Contigs DB ...................................: Initialized: 03-contigs/sval_mg_contigs.db (v. 14)
Config Error: Oh. You have the wrong stuff. Probably. Because, the contig 'c_000000061291_split_00001' does not match to any of the contig names in your database. Here is a random contig name you have in it in comparison: 'c_000000000001'.