Closed microDM closed 3 years ago
Hi @microDM,
This message here:
Config Error: default is not a valid collection ID. See a list of available ones with '--list-collections' flag
is for the program anvi-split, not for the program anvi-export-table. So I am not sure what your intention was in running this:
anvi-export-table Burkh_Pan-PAN.db -l
If you truly intend to split your pangenome into CORE, ACCESSORY and UNIQUE gene clusters, you should first create a collection that contains those gene clusters and store it in the pan database (the name you give to your collection is the name you will need when you run anvi-split with the -C parameter later). You can do this interactively through anvi-display-pan, or from the command line using anvi-import-collection after identifying which gene clusters are core, accessory, or singleton (you can use anvi-summarize on your pangenome to get that information).
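As a rough illustration of the "identify which gene clusters are core, accessory, or singleton" step, here is a minimal Python sketch. It assumes a TAB-delimited presence/absence table with one row per gene cluster and one column per genome, and it writes two-column TAB-delimited lines (gene cluster name, bin name) of the kind anvi-import-collection is generally given. The column layout, the function names, and the bin names CORE / ACCESSORY / UNIQUE are assumptions made for this example, not something anvi'o prescribes.

```python
import csv
import io


def classify_gene_clusters(table_text):
    """Map each gene cluster name to CORE, ACCESSORY, or UNIQUE.

    Assumed layout: a header row, the gene cluster name in the first
    column, and one presence/absence value (0 or 1) per genome after it.
    """
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    num_genomes = len(header) - 1
    bins = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        name, values = row[0], row[1:]
        present = sum(1 for value in values if float(value) > 0)
        if present == num_genomes:
            bins[name] = "CORE"       # found in every genome
        elif present == 1:
            bins[name] = "UNIQUE"     # found in exactly one genome
        else:
            bins[name] = "ACCESSORY"  # everything in between
    return bins


def to_collection_text(bins):
    """Render 'gene_cluster<TAB>bin' lines, one per gene cluster, no header."""
    return "".join(f"{name}\t{bin_name}\n" for name, bin_name in bins.items())


# Tiny made-up table with three genomes, only to show the behaviour.
example = (
    "item\tG1\tG2\tG3\n"
    "GC_00000001\t1\t1\t1\n"
    "GC_00000002\t1\t0\t1\n"
    "GC_00000003\t0\t1\t0\n"
)
bins = classify_gene_clusters(example)
print(to_collection_text(bins), end="")
```

The thresholds here are the strictest possible ones (all genomes for core, exactly one for unique); with 143 genomes you may prefer a softer core definition, which would only change the two comparisons.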
Best,
Got it.
I used anvi-export-table to export "gene_cluster_presence_absence", then marked the CORE, ACCESSORY and UNIQUE clusters.
Then I imported the collection using anvi-import-collection.
Then I split my pangenome using anvi-split.
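For anyone retracing these steps, the sequence can be sketched as a small Python helper that only assembles the command lines and does not run anything. The database paths and the output directory are taken from the commands in this issue; the file names `presence_absence.txt` and `core_accessory_unique.txt` and the collection name `pan_partitions` are hypothetical. The `--table` and `-o` flags for anvi-export-table and the `-p` / `-C` flags for anvi-import-collection are written from memory, so please check each program's `--help` for your anvi'o version before relying on them.

```python
import shlex


def build_commands(pan_db, genomes_db, table_file, collection_file,
                   collection_name, out_dir):
    """Return the three command lines as argument lists (nothing is executed)."""
    export_cmd = ["anvi-export-table", pan_db,
                  "--table", "gene_cluster_presence_absence",
                  "-o", table_file]
    import_cmd = ["anvi-import-collection", collection_file,
                  "-p", pan_db, "-C", collection_name]
    split_cmd = ["anvi-split", "-p", pan_db, "-g", genomes_db,
                 "-C", collection_name, "-o", out_dir]
    return [export_cmd, import_cmd, split_cmd]


commands = build_commands(
    pan_db="Burkh_Pan/Burkh_Pan-PAN.db",
    genomes_db="Burk-GENOMES.db",
    table_file="presence_absence.txt",            # hypothetical file name
    collection_file="core_accessory_unique.txt",  # hypothetical file name
    collection_name="pan_partitions",             # hypothetical collection name
    out_dir="temp-split",
)
for cmd in commands:
    # shlex.join quotes arguments where needed, so the line can be pasted into a shell
    print(shlex.join(cmd))
```

The marking of CORE, ACCESSORY and UNIQUE clusters happens between the first and second command, on the exported table. The collection name passed to anvi-import-collection is the same one anvi-split later expects after -C.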
You are a hacker, @microDM :) Great job.
Short description of the problem
I have followed the tutorial at https://merenlab.org/2016/11/08/pangenomics-v2/. Using anvi-pan-genome, I created a PAN-GENOME.db of 143 complete genomes. Now I want to split the PAN-GENOME into CORE, ACCESSORY and UNIQUE genes.
anvi'o version
Anvi'o .......................................: hope (v7-dev)
Profile database .............................: 35
Contigs database .............................: 20
Pan database .................................: 14
Genome data storage ..........................: 7
Auxiliary data storage .......................: 2
Structure database ...........................: 2
Metabolic modules database ...................: 2
tRNA-seq database ............................: 1
System info
I am using Ubuntu 20.10. I installed anvio using miniconda.
Detailed description of the issue
After running anvi-split with "default" as the collection name, I got the following output:
anvi-split -p Burkh_Pan/Burkh_Pan-PAN.db -C default -g Burk-GENOMES.db -o temp-split
Functions found .............................................: COG20_FUNCTION, COG20_CATEGORY, COG20_PATHWAY
Genomes storage .............................................: Initialized (storage hash: hash1a269245)
Num genomes in storage ......................................: 143
Num genomes will be used ....................................: 143
Pan DB ......................................................: Initialized: Burkh_Pan/Burkh_Pan-PAN.db (v. 14)
Gene cluster homogeneity estimates ..........................: Functional: [YES]; Geometric: [YES]; Combined: [YES]
Config Error: default is not a valid collection ID. See a list of available ones with '--list-collections' flag
anvi-export-table Burkh_Pan-PAN.db -l
self gene_clusters item_additional_data item_orders layer_additional_data layer_orders views collections_info collections_bins_info collections_of_contigs collections_of_splits states gene_cluster_frequencies gene_cluster_presence_absence
All of the collections tables are empty dataframes. How can I extract CORE, ACCESSORY and UNIQUE genes from the PAN-GENOME database?