steineggerlab / foldseek

Foldseek enables fast and sensitive comparisons of large structure sets.
https://foldseek.com
GNU General Public License v3.0

get predicted structure from amino acid sequence as input in command line version #188

Open Jigyasa3 opened 9 months ago

Jigyasa3 commented 9 months ago

Hi @martin-steinegger ,

Thank you for the awesome software for structure-based clustering and sequence search. I was wondering if it's possible to run the easy-search or easy-cluster commands with protein sequences as input?

martin-steinegger commented 9 months ago

We do not support running foldseek with amino acid sequences as input at the moment. For that, foldseek would need to predict the structures, which we believe is currently outside the scope of the software. However, @milot-mirdita has developed a feature for our webserver based on the language model ProstT5. I believe ProstT5 allows fast inference of 3Di sequences from amino acid sequences, so you could build a workflow based on ProstT5 and foldseek.

pranavathiyani commented 3 months ago

Hi @martin-steinegger & @milot-mirdita

I have a single file with multiple 3Di sequences predicted by ProstT5. Can you suggest how I can build a workflow to search them against a custom-built database? I would also like to know how to filter the final results in the output table, similar to the foldseek server's output.
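
For the filtering part of the question, one possible approach is to post-process the tab-separated result file in Python. The sketch below is only an illustration, not the server's actual filtering logic. It assumes the result file uses the 12-column BLAST-tab style layout (query, target, fident, alnlen, mismatch, gapopen, qstart, qend, tstart, tend, evalue, bits); if a different `--format-output` was used, `COLUMNS` has to be adjusted. The function name `filter_hits` and the thresholds are hypothetical choices made for this example.

```python
import csv
import io

# Assumed column order of the tab-separated result file (adjust if your
# output was written with a custom column selection).
COLUMNS = [
    "query", "target", "fident", "alnlen", "mismatch", "gapopen",
    "qstart", "qend", "tstart", "tend", "evalue", "bits",
]


def filter_hits(handle, max_evalue=1e-3, min_bits=0.0, top_n=None):
    """Group hits by query, keep those passing the thresholds,
    and sort each group by ascending E-value.

    handle     -- open text file (or any iterable of lines)
    max_evalue -- hypothetical default cutoff; pick one that suits your data
    min_bits   -- minimum bit score to keep a hit
    top_n      -- if given, keep only the best top_n hits per query
    """
    grouped = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        hit = dict(zip(COLUMNS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["bits"] = float(hit["bits"])
        if hit["evalue"] <= max_evalue and hit["bits"] >= min_bits:
            grouped.setdefault(hit["query"], []).append(hit)

    result = {}
    for query, hits in grouped.items():
        hits.sort(key=lambda h: h["evalue"])
        result[query] = hits if top_n is None else hits[:top_n]
    return result


# Small made-up example; the rows are invented for illustration only.
example = io.StringIO(
    "q1\tt1\t0.9\t100\t5\t0\t1\t100\t1\t100\t1e-20\t300\n"
    "q1\tt2\t0.3\t80\t50\t2\t1\t80\t5\t85\t0.5\t20\n"
    "q2\tt3\t0.5\t90\t40\t1\t1\t90\t1\t90\t1e-5\t80\n"
)
for query, hits in filter_hits(example).items():
    print(query, [h["target"] for h in hits])
```

For a real result file, the same function can be called as `filter_hits(open("result.m8"))`, where `result.m8` is a placeholder for your own output path.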