phiweger closed this issue 1 year ago
These sequences should be stored in the database files ending in _ss. You can convert them to a FASTA file using:
foldseek convert2fasta db_ss db_ss.fasta
This gives an error at the moment. The workaround is:
mv db tmp
cp db_ss db
foldseek convert2fasta db db_ss.fasta
mv tmp db
I get the following error:
foldseek convert2fasta queryDB_ss queryDB_ss.fasta
convert2fasta queryDB_ss queryDB_ss.fasta
MMseqs Version: a4983ce31e6e006a29d9d9330ce9f826cd555d3e
Use header DB false
Verbosity 3
Database queryDB_ss needs header information
A better workaround should be:
foldseek lndb queryDB_h queryDB_ss_h
foldseek convert2fasta queryDB_ss queryDB_ss.fasta
Note, though, that we might not keep the state-to-alphabet-letter assignments stable between releases.
Hi! I am running into some issues when trying to find the _ss file. When I run the easy-search command, I believe the command deletes this database. Could you advise me on a different command to run in order to receive this database as an output? I am solely interested in converting PDB files to 3Di sequences.
You can just use foldseek createdb pdbFolder outputDb to generate the _ss files.
Thank you so much! It works! This was very helpful!
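For anyone landing here later, the steps from this thread can be strung together roughly as below. This is only a sketch: pdb_to_3di, run, and DRY_RUN are made-up helper names, not part of foldseek, and the lndb step is the workaround suggested above, which may be unnecessary or need adjusting depending on your foldseek version.

```shell
#!/bin/sh
# Sketch: PDB folder -> 3Di FASTA, using only the commands quoted in this thread.
# run, pdb_to_3di and DRY_RUN are hypothetical helpers, not foldseek features.

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = folder with PDB files, $2 = name of the foldseek database to create.
pdb_to_3di() {
    pdb_dir=$1
    db=$2
    # 1. Build the database; this also writes the 3Di database (${db}_ss).
    run foldseek createdb "$pdb_dir" "$db" &&
    # 2. Workaround from this thread: give the _ss database header information.
    run foldseek lndb "${db}_h" "${db}_ss_h" &&
    # 3. Export the 3Di sequences as FASTA.
    run foldseek convert2fasta "${db}_ss" "${db}_ss.fasta"
}

# Example (commented out so that sourcing this file has no side effects):
#   DRY_RUN=1; pdb_to_3di pdbFolder outputDb   # only prints the three commands
#   DRY_RUN=0; pdb_to_3di pdbFolder outputDb   # actually runs foldseek
```

Running it with DRY_RUN=1 first is a cheap way to check the database names before anything is written to disk. As noted above, the letters in the resulting FASTA may not map to the same states across releases.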
Is it possible to reverse this, i.e., to generate the 3D structure from the states? As VAEs are generative models, "something" should come out when you decode the states. Is this possible through the current API?
For the foldseek databases, at least, we also store the C-alpha coordinates, and PULCHRA is implemented within foldseek. So you can get a reasonable backbone back from each foldseek database entry.
I don’t think we have looked into getting a structure back out of the 3Di states yet though.
We have recently added the --format-mode 5 option to our software, which generates PDB files with all C-alpha atoms superimposed based on the aligned coordinates.
If I understand correctly, the VQ-VAE used by foldseek translates each amino acid into one of 20 "states". Do we have access to these, i.e. is it possible to get the "state sequence"? Like:
AVGAI -> states 1, 5, 7, 1, 13
Thanks!