steineggerlab / foldseek

Foldseek enables fast and sensitive comparisons of large structure sets.
https://foldseek.com
GNU General Public License v3.0

pre-trained 3Di embedding #252

Closed: chooyu1998 closed this issue 3 months ago

chooyu1998 commented 3 months ago

I understand that the VQ-VAE used by Foldseek translates each amino acid into one of 20 states (e.g. before: VAL, after: ALA). Is there a way to get the pre-trained 3Di embedding before it is translated into the 20 states?

milot-mirdita commented 3 months ago

A community member has ported the VQ-VAE to numpy: https://github.com/althonos/mini3di

This should be the easiest way to play with the individual NN layers.
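To make "the embedding before translating into the 20 states" concrete, here is a minimal numpy sketch of the general VQ-VAE encoding idea: a small feed-forward encoder produces a continuous embedding per residue, and the discrete state is the index of the nearest centroid. The layer sizes, the random weights and centroids, and the names `embed` and `quantize` are all assumptions made for illustration. They are not the parameters or the API of Foldseek or mini3di.

```python
import numpy as np

# All sizes and weights below are placeholders for illustration,
# not the parameters shipped with Foldseek or mini3di.
N_FEATURES = 10   # per-residue input descriptor length (assumed)
N_HIDDEN = 10     # hidden layer width (assumed)
N_LATENT = 2      # embedding dimension (assumed)
N_STATES = 20     # number of discrete states, as in the question

rng = np.random.default_rng(0)
weights = [rng.normal(size=(N_FEATURES, N_HIDDEN)),
           rng.normal(size=(N_HIDDEN, N_LATENT))]
biases = [np.zeros(N_HIDDEN), np.zeros(N_LATENT)]
centroids = rng.normal(size=(N_STATES, N_LATENT))

def embed(features):
    """Run the encoder layers and return the continuous embedding."""
    hidden = np.maximum(features @ weights[0] + biases[0], 0.0)  # ReLU
    return hidden @ weights[1] + biases[1]

def quantize(embedding):
    """Map each embedding row to the index of its nearest centroid."""
    diff = embedding[:, None, :] - centroids[None, :, :]
    return np.square(diff).sum(axis=-1).argmin(axis=1)

features = rng.normal(size=(5, N_FEATURES))  # 5 fake residues
embedding = embed(features)    # continuous values, one row per residue
states = quantize(embedding)   # integer state indices in [0, N_STATES)
print(embedding.shape, states)
```

In this sketch, the value returned by `embed` is the continuous embedding, and `quantize` is the step that discards it in favour of a state index. With a real port of the network, the equivalent would be to read out the encoder output just before the nearest-centroid lookup.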

You might also want to consider ProstT5 for structural embeddings. Generating 3Di from ProstT5 works very well.

chooyu1998 commented 3 months ago

Thanks for your quick response!