wengong-jin / DSMBind


Source of Mutant Structures for Skempi + PDB preprocessing? #6

Open kborisiak opened 7 months ago

kborisiak commented 7 months ago

Hi! I am trying to use the build_skempi.py script to process some similar data for testing, and I have a few questions.

  1. It looks like the build_skempi.py script expects mutant structures as input, but SKEMPI does not provide mutant structures. How were these mutant structures obtained? Can you point me to where I can find them, or tell me how I should generate them (PyRosetta, FoldX, etc.) starting from wild-type PDBs?
  2. When trying to run the build script on new PDBs, I run into a reshaping error in this code:
    def process(tup):
        pdb, achain, bchain = tup
        _, acoords, aseq, _, _ = get_seq_coords_and_angles(achain)
        _, bcoords, bseq, _, _ = get_seq_coords_and_angles(bchain)
        acoords = acoords.reshape((-1,14,3))
        bcoords = bcoords.reshape((-1,14,3))
        return (pdb, aseq, acoords, bseq, bcoords)

    (cannot reshape array of size 12330 into shape (14,3)). If I instead change the reshape to (-1,15,3), the processing script runs, but the model expects coordinates in groups of 14 atoms per residue. Do I need to preprocess my PDBs first? Thanks!

wengong-jin commented 7 months ago

Thank you for reaching out.

For your first question, I used FoldX to generate the mutant structures, and I have uploaded these PDBs to Zenodo: https://zenodo.org/records/10582261

To answer your second question, sidechainnet was recently upgraded, and the upgrade changed the sidechain atom dimension from 14 to 15 atoms per residue. I would recommend using an older version of sidechainnet (e.g., v0.7.6).
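
As a rough illustration of the mismatch (not code from the DSMBind repository), the sketch below checks which atoms-per-residue layout a flat coordinate array is consistent with before it is reshaped. The helper names `layouts_that_fit` and `check_flat_size` are hypothetical, and the only layouts considered are the 14-atom and 15-atom ones mentioned in this thread.

```python
def layouts_that_fit(flat_size, candidates=(14, 15)):
    """Return the atoms-per-residue values whose (atoms * 3) evenly divides flat_size.

    flat_size is the total number of scalars in the coordinate array
    (residues * atoms_per_residue * 3). The candidates are an assumption
    taken from this thread: 14 (older sidechainnet) and 15 (newer).
    """
    return [n for n in candidates if flat_size % (n * 3) == 0]


def check_flat_size(flat_size, expected_atoms=14):
    """Return the residue count if flat_size fits the expected layout, else raise.

    Hypothetical helper: it only tests divisibility, so an array whose size
    happens to be divisible by both layouts will pass without complaint.
    """
    per_residue = expected_atoms * 3
    if flat_size % per_residue != 0:
        fits = layouts_that_fit(flat_size)
        raise ValueError(
            f"array of size {flat_size} does not fit {expected_atoms} atoms per "
            f"residue; layouts that do fit: {fits}. The installed sidechainnet "
            f"version may use a different atom dimension."
        )
    return flat_size // per_residue


# The size from the error message above fits only the 15-atom layout.
print(layouts_that_fit(12330))
```

Calling `check_flat_size(acoords.size)` before the reshape in `process` would turn the bare reshape failure into a message that points at the library version. The check is a heuristic, not a guarantee: it cannot tell the two layouts apart when the array size is divisible by both 42 and 45.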

Thanks

kruus commented 1 month ago

data/recA/ for the ligand virtual screening example seems unavailable.