soedinglab / MMseqs2-App

MMseqs2 app to run on your workstation or servers
https://search.foldseek.com
GNU General Public License v3.0

--expand-filter-clusters #25

Closed: cmielke-vir closed this issue 1 year ago

cmielke-vir commented 3 years ago

The MSA job pipeline built into this app contains the following lines:

    EXPAND_PARAM="--expansion-mode 0 -e ${EXPAND_EVAL} --expand-filter-clusters ${FILTER} --max-seq-id 0.95"
    ...
    "${MMSEQS}" expandaln "${BASE}/qdb" "${DBBASE}/${DB1}.idx" "${BASE}/res" "${DBBASE}/${DB1}.idx" "${BASE}/res_exp" --db-load-mode 2 ${EXPAND_PARAM}

However, mmseqs expandaln does not appear to recognize the --expand-filter-clusters parameter, and I can't find it anywhere in the source. Does this rely on an unpublished version of mmseqs?
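One quick way to check whether a particular build understands a flag is to search the module's help text. The following is a minimal sketch, assuming that calling the binary as `mmseqs <module> -h` prints that module's parameter list; the helper name `supports_flag` is hypothetical and not part of MMseqs2.

```shell
#!/bin/sh
# supports_flag BINARY MODULE FLAG
# Returns 0 if FLAG appears literally in the help text of MODULE, 1 otherwise.
# Assumption: "BINARY MODULE -h" prints the module's parameters.
supports_flag() {
    "$1" "$2" -h 2>&1 | grep -F -q -e "$3"
}

# Usage: falls back to "mmseqs" on PATH when MMSEQS is unset.
if supports_flag "${MMSEQS:-mmseqs}" expandaln --expand-filter-clusters; then
    echo "flag is listed in the help text"
else
    echo "flag is not listed (or the binary was not found)"
fi
```

Running the same check against a build of the fork and a build of the main repository would show which one carries the parameter.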

Awesome work on ColabFold!

milot-mirdita commented 3 years ago

Yes, this is part of my fork: https://github.com/milot-mirdita/mmseqs2

We are working on finishing the changes up and integrating them back into the main MMseqs2 repo.

cmielke-vir commented 3 years ago

Right on! Thanks for the pointer. This has gotten me much closer.

One last question: I'm having trouble understanding the provenance of the uniref30_2106.idx file. I've noticed that mmseqs2 createindex can sometimes create .idx files, but do these fundamentally differ from the .index format found in other databases that mmseqs can download through its internal databases command?
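For context, a `.index` file can be inspected directly, which helps when comparing it with an `.idx` file. The sketch below assumes the `.index` layout is plain text with one tab-separated record per line (key, offset, length); that layout is an assumption stated here rather than something confirmed in this thread, and the helper name `count_entries` is hypothetical.

```shell
#!/bin/sh
# count_entries INDEX_FILE
# Prints the number of well-formed records in a .index file.
# Assumption: one record per line, tab-separated as key, offset, length.
count_entries() {
    awk -F '\t' 'NF == 3 { n++ } END { print n + 0 }' "$1"
}

# Example call (the path is a placeholder, not a real file):
# count_entries /path/to/database.index
```

Note that the error output later in this thread refers to a `UniRef50.idx.index` path, so an `.idx` file appears to come with its own `.index` companion rather than replacing it.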

Thanks again!

Edit:

I'm currently struggling with the following error from the expandaln command. I suspect that this may be because the prepared .idx file is the wrong database type. I've also noticed that I can get .idx files either from the createindex command or by using touchdb to preload the database. Are these .idx file types fundamentally different?

    Index version: 16
    Generated by: e4aae9271f34a2e68e4d95a7356ef0bfe3ed9dc4
    ScoreMatrix: VTML80.out
    Index version: 16
    Generated by: e4aae9271f34a2e68e4d95a7356ef0bfe3ed9dc4
    ScoreMatrix: VTML80.out
    Invalid database read for database data file=../../databases/UniRef50.idx, database index=../../databases/UniRef50.idx.index
    getData: local id (4294967295) >= db size (22)
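The local id in the last line, 4294967295, equals 2^32 - 1, the largest value an unsigned 32-bit integer can hold. C and C++ programs often use that value as a "not found" sentinel, so one plausible reading (an assumption, not confirmed in this thread) is that a key lookup failed, rather than that entry number 4294967295 was really requested from a database of 22 entries. The arithmetic can be checked in the shell:

```shell
#!/bin/sh
# Largest unsigned 32-bit value, written as all 32 bits set.
uint32_max=$(( 0xFFFFFFFF ))
echo "$uint32_max"

# Compare against the local id reported in the error message.
if [ "$uint32_max" -eq 4294967295 ]; then
    echo "local id matches the unsigned 32-bit maximum"
fi
```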