steineggerlab / foldseek

Foldseek enables fast and sensitive comparisons of large structure sets.
https://foldseek.com
GNU General Public License v3.0

easy-cluster with external db #237

Open xnought opened 5 months ago

xnought commented 5 months ago

Can easy-cluster take one of the external databases as input?

i.e., one of these that you have listed:

  Name                      Type        Taxonomy    Url
- Alphafold/UniProt     Aminoacid        yes    https://alphafold.ebi.ac.uk/
- Alphafold/UniProt50   Aminoacid        yes    https://alphafold.ebi.ac.uk/
- Alphafold/Proteome    Aminoacid        yes    https://alphafold.ebi.ac.uk/
- Alphafold/Swiss-Prot  Aminoacid        yes    https://alphafold.ebi.ac.uk/
- ESMAtlas30            Aminoacid          -    https://esmatlas.com
- PDB                   Aminoacid        yes    https://www.rcsb.org

or do I have to go through a different route? I'd greatly appreciate any direction here.

[!NOTE] When I tried easy-cluster with the PDB database (named pdb below), I just got this:

[Screenshot 2024-01-31 at 5.57.00 PM]

So I suspect those external databases are only for search? How wrong am I here?

milot-mirdita commented 4 months ago

The easy-cluster and easy-linclust workflows don't take databases as input. You have to use the cluster workflow.

This is inconsistent, since we added support for query database input to easy-search but not to the clustering workflows. Sorry about that!
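
For readers landing here, the cluster route mentioned above might look roughly like the sketch below. It is an illustration under assumptions, not an official recipe: the output names (pdb, clu, tmp, clu.tsv) are placeholders, the `run` and `cluster_external_db` helpers are hypothetical wrappers written for this example, and the final `createtsv` step is just one way to get a readable cluster table. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch: cluster a downloaded Foldseek database with the `cluster` workflow.
# Output names (pdb, clu, tmp, clu.tsv) are placeholders chosen for illustration.

# Print the command when DRY_RUN=1 (the default here); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: cluster_external_db <database name> <sequenceDB> <clusterDB> <tmpDir>
cluster_external_db() {
    name=$1; seqdb=$2; cludb=$3; tmpdir=$4
    # 1. Download the external database (e.g. PDB) into a Foldseek database.
    run foldseek databases "$name" "$seqdb" "$tmpdir" &&
    # 2. Cluster the database directly; easy-cluster is not involved.
    run foldseek cluster "$seqdb" "$cludb" "$tmpdir" &&
    # 3. Convert the cluster database into a representative/member TSV.
    run foldseek createtsv "$seqdb" "$seqdb" "$cludb" "$cludb.tsv"
}

cluster_external_db PDB pdb clu tmp
```

Clustering one of the larger databases this way can take a long time and a lot of disk space, so it may be worth trying the steps on a small database first.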

martin-steinegger commented 4 months ago

We will add database input support to easy-cluster in the future.

xnought commented 4 months ago

Awesome!