`Error in argument --min-seq-id` when using `easy-linclust` or `linclust` #682

Open taylorreiter opened 1 year ago

taylorreiter commented 1 year ago

Expected Behavior

`mmseqs easy-linclust` runs successfully when given the flag `--min-seq-id`.

Current Behavior

`mmseqs easy-linclust` fails with `Error in argument --min-seq-id` when given the flag `--min-seq-id`.

Steps to Reproduce (for bugs)

curl -JLO
mmseqs easy-linclust GCA_016584425.1_ASM1658442v1_translated_cds.faa.gz tmp_linclust tmp_mmseqs  --min-seq-id .9 -c .9 --similarity-type 2 --cov-mode 1

MMseqs Output (for bugs)

Error in argument --min-seq-id


I would like to use linclust to cluster sequences at 90% length coverage and 90% identity, but the flag `--min-seq-id` keeps throwing an error. Is there a different way to specify the sequence identity threshold for `easy-linclust`?

Your Environment


I also reproduced the error on an Ubuntu AWS EC2 instance with the same version of MMseqs2.

milot-mirdita commented 1 year ago

We validate that (most) float parameters are written either with a leading zero (or another digit in front of the `.`), e.g. `0.9` rather than `.9`, or in scientific notation.
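
As a rough illustration of the rule described above, the following shell sketch accepts a value only if it has at least one digit before the decimal point or is written in scientific notation. The function name `is_accepted_float` and the regular expression are assumptions made for this example; they are not taken from the MMseqs2 source, and the actual check may differ in detail.

```shell
# Hypothetical helper, not MMseqs2 code: approximates the described rule.
# Accepts: digits, optionally followed by a fractional part (0.9, 1, 1.0),
#          optionally followed by an exponent (9e-1, 9.0E-1).
# Rejects: values that start with the decimal point (.9).
is_accepted_float() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?$'
}

is_accepted_float 0.9  && echo "0.9 accepted"
is_accepted_float 9e-1 && echo "9e-1 accepted"
is_accepted_float .9   || echo ".9 rejected"
```

Under that rule, the command from the report should pass validation once the two thresholds are written as `--min-seq-id 0.9 -c 0.9`.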

A similar issue was reported here:

I guess we could relax the validation step.

taylorreiter commented 1 year ago

Ah interesting, thank you for the explanation. Alternatively, having an informative error message that explains the failure would be helpful!
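
To make the suggestion concrete, here is a minimal sketch of what a more informative message could look like, written as a shell pre-check that a user might run on their own arguments. The function name `check_float_arg`, the accepted pattern, and the message wording are all hypothetical; nothing here reflects MMseqs2's actual implementation or output.

```shell
# Hypothetical pre-check, not part of MMseqs2.
# $1 = flag name, $2 = value supplied by the user.
check_float_arg() {
    if printf '%s\n' "$2" | grep -Eq '^[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?$'; then
        return 0
    fi
    # Say which flag failed, what was received, and what form is expected.
    echo "Error in argument $1: got '$2', expected a number with a leading digit (e.g. 0.9) or scientific notation (e.g. 9e-1)" >&2
    return 1
}

check_float_arg --min-seq-id .9 || echo "validation failed"
```

The point of the sketch is only that the message names the offending value and the expected form, so the fix (`0.9` instead of `.9`) is obvious from the error alone.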

milot-mirdita commented 1 year ago

You are right, that's generally something we should have invested more time in. There is just so much to do...

taylorreiter commented 1 year ago

Absolutely :) Well, thank you very much for taking the time to respond to this issue! Being able to use linclust will save me orders of magnitude of time.