steineggerlab / foldseek

Foldseek enables fast and sensitive comparisons of large structure sets.
https://foldseek.com
GNU General Public License v3.0

Guidance to search small fragments (8-mer) #368

Open alexproteomics opened 1 month ago

alexproteomics commented 1 month ago

Hi, I'm new to Foldseek and wonder if you can provide guidance on how to obtain a structural alignment for a small fragment of a protein. Basically, the protein has an 8-aa segment forming a beta-hairpin, and I want to align it against the entire AlphaFold Swiss-Prot database. I tried the default easy-search, but nothing came back in my output file. I understand that a small fragment will align to too many sequences, and maybe nothing is passing the minimum thresholds. I'm OK with having false positives since I have other means to filter the table downstream, but since I'm not getting anything back from easy-search, I guess I need to change some parameters so that low-confidence hits are written to my output file. I would appreciate some guidance on how to address this. Thanks, Alex

milot-mirdita commented 1 month ago

Please try --prefilter-mode 1. This is a different prefiltering algorithm that doesn't have a minimum length for matches. The k-mer prefilter cannot match anything shorter than ~13aa in its default settings.

Other parameters that you might need to change are the E-value threshold (-e inf) and the minimum ungapped score (--min-ungapped-score 5; the default is 15, and you may need to go even lower).
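Putting the three suggested options together, a call could look roughly like the sketch below. The file names (hairpin_8mer.pdb, afdb_swissprot, hairpin_hits.m8, tmp) are placeholders assumed for illustration, not names from this thread, and the target database is assumed to exist already. The script prints the command and only runs it if foldseek is on the PATH and the query file is present.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder names (assumptions for illustration; substitute your own).
QUERY="hairpin_8mer.pdb"      # structure file containing the 8-residue fragment
TARGET_DB="afdb_swissprot"    # existing Foldseek database of AlphaFold Swiss-Prot
RESULT="hairpin_hits.m8"      # tabular output file
TMP_DIR="tmp"                 # scratch directory

# Options taken from the reply above:
#   --prefilter-mode 1      prefilter without a minimum match length
#   -e inf                  disable the E-value cutoff
#   --min-ungapped-score 5  lower the ungapped score threshold (default 15)
cmd=(foldseek easy-search "$QUERY" "$TARGET_DB" "$RESULT" "$TMP_DIR"
     --prefilter-mode 1
     -e inf
     --min-ungapped-score 5)

# Show the command that would be executed.
printf '%s ' "${cmd[@]}"
echo

# Run only when foldseek is installed and the query file exists.
if command -v foldseek >/dev/null 2>&1 && [ -f "$QUERY" ]; then
    "${cmd[@]}"
fi
```

If the result file is still empty, lowering --min-ungapped-score further is the next thing to try, as suggested above. Expect many low-confidence hits with these settings, so downstream filtering will be needed.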

alexproteomics commented 1 month ago

@milot-mirdita Thank you very much for your help