HannesStark / EquiBind

EquiBind: geometric deep learning for fast predictions of the 3D structure in which a small molecule binds to a protein
MIT License

Why use openbabel in data preprocessing #70

Closed tangmaomao16 closed 1 month ago

tangmaomao16 commented 1 month ago

I see the code file https://github.com/HannesStark/EquiBind/blob/main/data_preparation/openbabel_receptors.py:

```python
import os
import subprocess

import time

from tqdm import tqdm

start_time = time.time()
data_path = 'data/PDBBind'
overwrite = False
names = sorted(os.listdir(data_path))

for i, name in tqdm(enumerate(names)):
    rec_path = os.path.join(data_path, name, f'{name}_protein.pdb')
    return_code = subprocess.run(
        f"obabel {rec_path} -O{os.path.join(data_path, name, f'{name}_protein_obabel.pdb')}", shell=True)
    print(return_code)

print("--- %s seconds ---" % (time.time() - start_time))
```

My guess is that the authors use Open Babel to convert each receptor protein PDB file into another PDB file. Why is this step needed?

HannesStark commented 1 month ago

This step is unnecessary and should not be helpful.
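
For anyone who still wants to run the conversion, for example to compare the two PDB files, here is a minimal sketch of the same loop. It is not the repository's code: the function names `build_obabel_command` and `convert_receptors` and the `runner` parameter are hypothetical. The way `overwrite` is used is an assumption, since the original script defines it without using it. The sketch passes the command as an argument list instead of a shell string, so paths containing spaces are not split by the shell.

```python
import os
import subprocess


def build_obabel_command(data_path, name):
    """Return the obabel argument list for one complex directory (hypothetical helper)."""
    rec_path = os.path.join(data_path, name, f"{name}_protein.pdb")
    out_path = os.path.join(data_path, name, f"{name}_protein_obabel.pdb")
    # Same input and output naming as the original script; -O names the output file.
    return ["obabel", rec_path, "-O", out_path]


def convert_receptors(data_path, overwrite=False, runner=subprocess.run):
    """Run obabel once per complex and return the names that were processed.

    Assumption: overwrite=False means "skip a complex whose output file
    already exists". The original script never reads its overwrite flag.
    """
    processed = []
    for name in sorted(os.listdir(data_path)):
        cmd = build_obabel_command(data_path, name)
        if not overwrite and os.path.exists(cmd[-1]):
            continue  # output already present, nothing to do
        runner(cmd, check=False)  # check=False: a failed conversion does not stop the loop
        processed.append(name)
    return processed
```

Calling `convert_receptors('data/PDBBind')` would mirror the original script, given that `obabel` is on the `PATH`. The `runner` argument exists only so the loop can be exercised without Open Babel installed.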