steineggerlab / foldcomp

Compressing protein structures effectively with torsion angles
GNU General Public License v3.0

Can foldseek database be built from fcz files? #15

Closed by yakomaxa 1 year ago

yakomaxa commented 1 year ago

I have a set of fcz files and built a custom FoldComp database following the protocol kindly described here: https://github.com/steineggerlab/foldcomp/issues/14

Next, I would like to search the structures using FoldSeek. One obvious solution would be to dump the PDB/mmCIF files from FoldComp and convert them into a FoldSeek database.
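The decompress-then-convert workaround described above can be sketched roughly as follows. This is only an illustration: it assumes the `foldcomp` and `foldseek` executables are on the PATH, and the function names and the paths in the usage note are hypothetical placeholders, not anything taken from this thread.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def workaround_commands(foldcomp_db: str, pdb_dir: str, foldseek_db: str) -> List[List[str]]:
    """Build the two commands of the workaround without running them.

    Hypothetical helper: the argument names are placeholders.
    """
    return [
        # Step 1: dump the compressed structures from the FoldComp database.
        ["foldcomp", "decompress", foldcomp_db, pdb_dir],
        # Step 2: convert the dumped structure files into a FoldSeek database.
        ["foldseek", "createdb", pdb_dir, foldseek_db],
    ]


def run_workaround(foldcomp_db: str, pdb_dir: str, foldseek_db: str,
                   keep_pdbs: bool = False) -> None:
    """Run both steps, then remove the temporary structure files by default."""
    # Creating the directory up front is harmless if the tool also creates it.
    Path(pdb_dir).mkdir(parents=True, exist_ok=True)
    for cmd in workaround_commands(foldcomp_db, pdb_dir, foldseek_db):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    if not keep_pdbs:
        shutil.rmtree(pdb_dir)
```

Calling `run_workaround("my_foldcomp_db", "tmp_pdbs", "my_foldseek_db")` would run both steps in order and delete the intermediate files afterwards, so the decompressed structures only exist temporarily.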

However, is there a more direct way to build a FoldSeek database from a FoldComp dataset? It would be nice to build a FoldSeek database without decompressing the FoldComp files.

Thank you again for the excellent software!

khb7840 commented 1 year ago

Currently we don't have direct conversion features for Foldseek, but we plan to extend Foldseek to support Foldcomp databases (and the fcz file format). As you suggested, we will add direct conversion features for various formats, including saving only Ca coordinates (as in the current Foldseek DB). Thanks for your interest! We'll try to update Foldcomp with more features soon :)

yakomaxa commented 1 year ago

Thank you for explaining the current status and future plans. Decompressing fcz files (or a whole FoldComp db) into pdb files is not a big problem because the files are only temporary, but it would be helpful if FoldSeek could read them directly.

Your software is fascinating. I expect the fcz format will become a standard format for large structure databases.

I'm looking forward to seeing further development of FoldComp and FoldSeek!

milot-mirdita commented 1 year ago

The latest foldseek git version contains code to directly read foldcomp databases. It's not very well tested yet, but we would appreciate feedback.
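A minimal sketch of what using this might look like, with one loud assumption: the foldcomp database path is passed to `foldseek createdb` in the position where PDB/mmCIF input normally goes. The comment above does not spell out the invocation, so check `foldseek createdb -h` for your build. The function names are hypothetical.

```python
import subprocess
from typing import List


def direct_createdb_command(foldcomp_db: str, foldseek_db: str) -> List[str]:
    """Build the createdb command for the direct route without running it."""
    # Assumption: the foldcomp database is given as the createdb input,
    # in place of a directory of PDB/mmCIF files.
    return ["foldseek", "createdb", foldcomp_db, foldseek_db]


def build_directly(foldcomp_db: str, foldseek_db: str) -> None:
    """Run createdb on the foldcomp database; raises if foldseek fails."""
    subprocess.run(direct_createdb_command(foldcomp_db, foldseek_db), check=True)
```

If this works with your build, no intermediate decompressed files are written.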

yakomaxa commented 1 year ago

Thank you for developing and announcing this feature. I've successfully built a foldseek database from my custom foldcomp database. It works nicely so far.