Open Ashranzzler opened 2 years ago
Hi, thank you for the question.
To obtain contacts from PDB files, you can use pydca. For example:
from pydca.contact_visualizer.contact_visualizer import DCAVisualizer
pdb_file = 'path_to_the_pdb_file'
refseq_file = 'path_to_the_refseq_file'
chain_id = 'pdb_chain_id'
dca_vis_instance = DCAVisualizer(
    biomolecule='RNA',
    pdb_file=pdb_file,
    pdb_chain_id=chain_id,
    refseq_file=refseq_file,
)
mapped_pairs, missing_pairs = dca_vis_instance.get_mapped_pdb_contacts()
mapped_pairs is a dictionary whose keys are site pairs (indices starting from zero) and whose values are tuples holding the atom pair names and the distance between them.
Using a contact definition of, e.g., 10 Angstrom and keeping only nucleotides that are more than 4 sites apart in the sequence, contacts can be filtered as
contacts = [
    site_pair for site_pair in mapped_pairs
    if mapped_pairs[site_pair][-1] < 10.0 and site_pair[1] - site_pair[0] > 4
]
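To try the filter without a PDB file, here is a minimal sketch that runs on a hand-made dictionary. The mock data, the exact layout of each value (atom names first, distance last) and the helper name filter_contacts are assumptions for illustration only, not part of pydca; the real dictionary returned by get_mapped_pdb_contacts() may be laid out differently.

```python
# Hypothetical stand-in for the dictionary returned by get_mapped_pdb_contacts().
# Assumed shape: {(i, j): (atom_pair_names, distance)}, zero-based sites with i < j.
mock_mapped_pairs = {
    (0, 3): (("P", "P"), 6.2),      # close in space, but only 3 sites apart
    (0, 7): (("N1", "N3"), 8.9),    # close in space and far enough in sequence
    (2, 12): (("C4", "O2"), 14.5),  # too far apart in space
    (5, 10): (("N3", "N1"), 3.1),   # close in space and far enough in sequence
}


def filter_contacts(mapped_pairs, max_distance=10.0, min_separation=4):
    """Return sorted site pairs with distance < max_distance and
    sequence separation > min_separation (illustrative helper, not a pydca function)."""
    return sorted(
        (i, j)
        for (i, j), value in mapped_pairs.items()
        if value[-1] < max_distance and j - i > min_separation
    )


contacts = filter_contacts(mock_mapped_pairs)
print(contacts)
```

Keeping the cutoff and the minimum separation as parameters makes it easy to try other contact definitions on the same mapped pairs.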
Best, Mehari
Hi, I would like to ask how to get the ground-truth contacts from the PDB files in your provided dataset. Thanks.