steineggerlab / foldseek

Foldseek enables fast and sensitive comparisons of large structure sets.
https://foldseek.com
GNU General Public License v3.0

Easy-Cluster CLI sensitivity flag documentation bug? #229

Open danny305 opened 8 months ago

danny305 commented 8 months ago

When you run foldseek easy-search -h, the help text for the sensitivity flag reads: -s FLOAT Sensitivity: 1.0 faster; 4.0 fast; 7.5 sensitive [9.500]

It seems like the range is 1.0 to 7.5 (similar to mmseqs2). However, the default is set to 9.5? When I run it, it does indeed set the sensitivity parameter to 9.5.

It would be great to get some clarification on whether or not this is a bug.

Danny

milot-mirdita commented 8 months ago

We didn't update the description string coming from MMseqs2. 9.5 is the intended sensitivity to be used with Foldseek.

danny305 commented 8 months ago

Okay so what is the max sensitivity value? 10?

In mmseqs2, 5.7 is the default. In foldseek what would be the equivalent sensitivity value (approximately)? Seems like 5.7 maps to 9.5? Is this a fair assumption?

Thank you for getting back to me so quickly. I'm trying to finish my revisions for my manuscript and would like to incorporate foldseek!

Danny

milot-mirdita commented 8 months ago

The default sensitivity of Foldseek is set much higher than the default MMseqs2 sensitivity.

MMseqs2's 7.5 and Foldseek's 9.5 should be about the maximum intended sensitivity to set. Going higher might actually reduce real sensitivity, as it produces enormously large k-mer similarity lists, and it might also hurt performance.
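
For anyone scripting around this, here is a minimal sketch of how a wrapper might pass -s explicitly and warn when a value exceeds the intended maximum mentioned above. The function name and the constant are hypothetical, the 9.5 ceiling is only the guidance from this thread (not a limit Foldseek enforces), and the positional order (query, target, result, tmp directory) is assumed from the usual easy-search usage.

```python
import warnings

# Guidance from this thread, not a hard limit enforced by Foldseek (assumption).
FOLDSEEK_MAX_INTENDED_SENSITIVITY = 9.5


def build_easy_search_cmd(query, target, result, tmp_dir, sensitivity=9.5):
    """Build an argv list for `foldseek easy-search` with an explicit -s value.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if sensitivity > FOLDSEEK_MAX_INTENDED_SENSITIVITY:
        # Values above the intended maximum may reduce real sensitivity
        # and slow the search, so flag them instead of passing silently.
        warnings.warn(
            f"-s {sensitivity} is above the intended maximum of "
            f"{FOLDSEEK_MAX_INTENDED_SENSITIVITY}"
        )
    return [
        "foldseek", "easy-search",
        str(query), str(target), str(result), str(tmp_dir),
        "-s", str(sensitivity),
    ]
```

The returned list can be handed to subprocess.run(cmd, check=True) on a machine where foldseek is installed.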

danny305 commented 8 months ago

Okay thank you for making that clear!

I've been playing with it the last few days and it's a really great tool!

Already have several ideas on how to incorporate it for the next round of projects here at DeepProteins@IFML.