Closed (alquraishi closed this issue 7 years ago)
Hi Mohammed,
I think I know what's happening, and I can fix it tomorrow.
As a short-term workaround until then: if you explicitly pass MMseqs a sensitivity, it won't automatically try to use the cascaded workflow, which is (most likely) where this issue is happening. You do want the cascaded clustering for a larger database, though.
Best regards, Milot
Thanks for the tip, Milot. Yes, you're right: specifying a sensitivity does make it run. It's a bit strange, because cascaded clustering is off by default (at least according to the help), so I'm not sure why it's being used without me explicitly asking for it. At any rate, I look forward to the fix!
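For readers following along, the workaround discussed above can be sketched roughly as below. This is only an illustration: the thread does not show the exact command that was run, so it assumes the command from the original report with a sensitivity appended via `-s`, and the value 7.5 is an arbitrary choice. The helper name `workaround_cmd` is hypothetical, and the script only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the workaround: print the command instead of running it.
# DB, DB_clu and tmp are the placeholder names used in the original report.
# workaround_cmd is a hypothetical helper, not part of MMseqs2.
workaround_cmd() {
  local sensitivity="$1"
  # -s sets the sensitivity explicitly; per the thread, this keeps
  # MMseqs2 from switching to the cascaded workflow on its own.
  echo "mmseqs cluster DB DB_clu tmp --min-seq-id 0.4 --target-cov 0.8 --cluster-mode 2 -s ${sensitivity}"
}

# 7.5 is an assumed value, not one taken from the thread.
workaround_cmd 7.5
```

To actually run it, paste the printed line into a shell where `mmseqs` is installed and the databases exist.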
Hi Mohammed, I removed the --target-cov parameter and replaced it with a new parameter --cov-mode 1.
The documentation was also updated: https://github.com/soedinglab/MMseqs2/wiki#how-to-set-the-right-alignment-coverage-to-cluster
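As a rough sketch of what the change means for the command from the original report: assuming the coverage threshold is now given with `-c` and `--cov-mode 1` applies it to the target sequence (check the linked wiki section for the authoritative description), the call would look something like the following. The helper name `cov_mode_cmd` is hypothetical, and the script only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the updated command instead of running it.
# DB, DB_clu and tmp are the placeholder names used in the original report.
# cov_mode_cmd is a hypothetical helper, not part of MMseqs2.
cov_mode_cmd() {
  local coverage="$1"
  # Assumption: -c carries the coverage threshold and --cov-mode 1
  # makes it apply to the target, replacing the old --target-cov flag.
  echo "mmseqs cluster DB DB_clu tmp --min-seq-id 0.4 -c ${coverage} --cov-mode 1 --cluster-mode 2"
}

# 0.8 is the coverage value from the original report.
cov_mode_cmd 0.8
```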
Why it's automatically choosing the cascaded clustering: we originally implemented some heuristics to automatically choose the best parameters for the clustering. Maybe we have to rethink that decision and remove them.
Best regards, Milot
Thank you Milot! It appears to be working now.
Based on the latest version from master, it appears that the --target-cov option is not currently working with mmseqs cluster. Specifically, something like:
mmseqs cluster DB DB_clu tmp --min-seq-id 0.4 --target-cov 0.8 --cluster-mode 2
returns a number of errors. I think what's happening is that internally, the options -c 0 and --target-cov 0.8 are being passed to other commands (e.g. linclust), and they're failing because they're not expecting both options at once. Note that I am not passing -c 0 to the main command; it is generated internally and passed to the other commands.