Open LLehner opened 3 days ago
It is the sequence identity of the amino acid sequences, computed over the structural alignment.
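To make that concrete, here is a minimal Python sketch of how an amino acid identity could be computed once a structural alignment has fixed which residues are paired. This is not foldseek's actual code: the function name `fraction_identical` is hypothetical, the gapped input strings are made up, and dividing by the full alignment length (gap columns included) is an assumption about the denominator.

```python
def fraction_identical(aligned_query: str, aligned_target: str) -> float:
    """Fraction of alignment columns holding the same amino acid.

    Both inputs are gapped amino acid strings of equal length, where the
    column pairing comes from a structural alignment ('-' marks a gap).
    Assumption: the denominator is the alignment length, gaps included.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        return 0.0
    # Count columns where both residues are identical and neither is a gap.
    matches = sum(
        1
        for q, t in zip(aligned_query, aligned_target)
        if q == t and q != "-"
    )
    return matches / len(aligned_query)


# Made-up example: 4 identical columns out of 6 aligned columns.
example = fraction_identical("MKV-LA", "MRVQLA")
print(f"{example:.3f}")
```

The point of the sketch is that the residues being compared are still amino acids; only the choice of which residues get paired comes from the structure.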
Thank you for your quick response and clarification!
Just one more question:
With foldseek we get many protein pairs below 25% sequence identity, while with mmseqs2 there are barely any. Since we use the exact same dataset (and identical parameters where possible), could this mean foldseek is better at detecting pairs of evolutionarily distant (diverged) proteins, where only some structurally relevant domains are conserved? The proteins in question have ungapped alignment lengths of ~50-500.
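For reference, this is roughly how the share of pairs below 25% could be counted from a tab-separated result file. It is only a sketch under assumptions: the helper name `fraction_below` is hypothetical, and it assumes the identity sits in the third column as a fraction between 0 and 1, so the column index and threshold would need adjusting if the output was written with a different column order or as a percentage.

```python
from typing import Iterable


def fraction_below(lines: Iterable[str], threshold: float = 0.25,
                   fident_col: int = 2) -> float:
    """Share of result rows whose identity is below `threshold`.

    Assumptions: rows are tab-separated, and column `fident_col`
    (0-based) holds the identity as a fraction in [0, 1].
    """
    total = 0
    below = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        total += 1
        if float(fields[fident_col]) < threshold:
            below += 1
    return below / total if total else 0.0


# Made-up rows (query, target, identity) standing in for a real result file.
rows = ["q1\tt1\t0.20", "q1\tt2\t0.50", "q2\tt3\t0.10", ""]
share = fraction_below(rows)
print(f"{share:.2f} of pairs are below 25% identity")
```

With a real file the same function can be fed an open file handle, e.g. `fraction_below(open(path))`, once for each tool's output, where `path` is wherever the results were written.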
Hello, thank you for this great tool.
Currently we are trying to redo the HFSP curve using foldseek instead of mmseqs2. When using identical data we noticed a shift downwards in % sequence identity using foldseek compared to mmseqs2.
My question is: Is 'fident' reported by foldseek based on the AA residues in protein sequences (like mmseqs2) or is it based on the new structural alphabet of 3Di states?
Edit: we noticed a downwards shift in foldseek, not upwards, in terms of % sequence identity