Closed by sadiogo 11 months ago
There are more modern alternatives for this that we prioritise.
Totally understandable. I like Foldseek (which was also mentioned by another user). I just thought DASH would be an easy, fast alternative to implement while better ones are still in development.
DASH (Database of Aligned Structural Homologs) is a database of structural alignments for all known structurally homologous protein domains and chains in the PDB.
They provide a FASTA database of the protein domains in the PDB, in which every domain belongs to a structural cluster. The cluster assignment is recorded in the FASTA header, so all that needs to be done is to parse the header of each hit that BLAST returns. Every hit is then linked to a cluster, and all proteins in that cluster are suitable candidates for extracting ligands.
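To make the idea concrete, here is a minimal Python sketch of the lookup step. I have not checked the exact DASH header layout, so the `>DOMAIN_ID cluster=CLUSTER_ID` format, the regex, and the function names below are all hypothetical placeholders; the pattern would need to be adapted to the real headers.

```python
import re
from collections import defaultdict

# ASSUMPTION: hypothetical header layout ">DOMAIN_ID cluster=CLUSTER_ID".
# The real DASH headers may differ; swap in a matching pattern.
HEADER_RE = re.compile(r"^>(?P<domain>\S+)\s+cluster=(?P<cluster>\S+)")


def build_cluster_index(fasta_lines, header_re=HEADER_RE):
    """Scan FASTA header lines and map domains to clusters and back."""
    domain_to_cluster = {}
    cluster_to_domains = defaultdict(list)
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, nothing to parse
        match = header_re.match(line)
        if match is None:
            continue  # header does not follow the assumed layout
        domain = match.group("domain")
        cluster = match.group("cluster")
        domain_to_cluster[domain] = cluster
        cluster_to_domains[cluster].append(domain)
    return domain_to_cluster, dict(cluster_to_domains)


def candidates_for_hit(hit_id, domain_to_cluster, cluster_to_domains):
    """Return every domain that shares a cluster with the BLAST hit."""
    cluster = domain_to_cluster.get(hit_id)
    if cluster is None:
        return []
    return cluster_to_domains[cluster]


# Made-up records, only to show the flow.
example = [
    ">1abc_A_1 cluster=C1",
    "MKT",
    ">2xyz_B_1 cluster=C1",
    "MKV",
    ">3def_A_1 cluster=C2",
    "GSH",
]
d2c, c2d = build_cluster_index(example)
print(candidates_for_hit("1abc_A_1", d2c, c2d))
```

Here `hit_id` stands for the subject identifier of a BLAST hit. The index only needs to be built once per database file, after which each hit is a dictionary lookup.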
The only caveat is that the database hasn't been updated since 2020, but I'm sure they would be willing to collaborate with AlphaFill.
You can download the domain FASTA database here. There is also plenty of information on how to use their API.