soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
GNU General Public License v3.0

Error using search module: database type (Clustering) #876

Open PierreFoucault opened 2 months ago

PierreFoucault commented 2 months ago

Expected Behavior

Searching against MMseqs2 databases after the clustering step.

Current Behavior

After creating the MMseqs2 database (which gives a nucleotide-type database) and clustering, the search does not proceed because of a formatting issue (database type Clustering).

I tried to create a new sub-database using the index from the clustering DB (createsubdb clust_ORF_min100/dORF_seq_DB.index clust_ORF_min100/dORF_seq_DB clust_ORF_min100/dORF_rep_DB), but it still prints the same error.

How can I change the MMseqs2 cluster output to make it usable by the search module?

MMseqs Output (for bugs)

Input database "clust_ORF_min100/dORF_rep_DB" has the wrong type (Clustering)

Context

Providing context helps us come up with a solution and improve our documentation for the future.

Your Environment

MMseqs Version: 15.6f452, conda environment on a server (Linux distribution)

milot-mirdita commented 2 months ago

How did you create clust_ORF_min100/dORF_seq_DB?

PierreFoucault commented 2 months ago

Hello, I used the mmseqs linclust module:

mmseqs linclust clust_ORF_min100/ORF_min100_seqDB clust_ORF_min100/dORF_seq_DB tmp_mmseqs --min-seq-id 0.95 -c 0.8 --cov-mode 1 -e 0.001 --threads 16

clust_ORF_min100/ORF_min100_seqDB is an MMseqs2 DB (made with createdb) created from a FASTA file (output of prodigal -meta).

milot-mirdita commented 2 months ago

mmseqs createsubdb clust_ORF_min100/dORF_seq_DB.index clust_ORF_min100/ORF_min100_seqDB clust_ORF_min100/dORF_rep_DB

Should be the correct command. The second parameter must be the input sequence database; the resulting subset database is then also a sequence database. You were passing the clustering database into createsubdb, so it was creating a subset of the clustering database, which is why the result still had the type Clustering.
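For reference, the fix above can be wrapped in a small shell sketch. The database paths are the ones from this thread; the `run` helper, the `DRY_RUN` switch and the `representative_db` function are hypothetical names added for illustration and are not part of MMseqs2. With `DRY_RUN=1` (the default assumed here) the script only prints the command instead of executing it.

```shell
#!/bin/sh
# Sketch only: extract representative sequences after linclust.
# DRY_RUN is a hypothetical switch; 1 prints the command, anything else runs it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"      # show the command without executing it
    else
        "$@"           # execute the command as given
    fi
}

SEQ_DB=clust_ORF_min100/ORF_min100_seqDB   # sequence DB made with createdb
CLU_DB=clust_ORF_min100/dORF_seq_DB        # linclust output (type Clustering, despite its name)
REP_DB=clust_ORF_min100/dORF_rep_DB        # output: representative sequences

representative_db() {
    # 1st argument: index of the clustering DB (keys of the representatives)
    # 2nd argument: the input *sequence* DB, not the clustering DB
    # 3rd argument: the new subset DB, which is then a sequence DB as well
    run mmseqs createsubdb "${CLU_DB}.index" "$SEQ_DB" "$REP_DB"
}

representative_db
```

Running it with `DRY_RUN=0` would execute the createsubdb call for real, after which clust_ORF_min100/dORF_rep_DB should be accepted wherever a sequence database is expected.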