BioinfoMachineLearning / DIPS-Plus

The Enhanced Database of Interacting Protein Structures for Interface Prediction
https://zenodo.org/record/5134732
GNU General Public License v3.0

about make dips dataset #7

Closed by XuBlack 3 years ago

XuBlack commented 3 years ago

When I run the command python3 project/datasets/builder/make_dataset.py project/datasets/DIPS/raw/pdb project/datasets/DIPS/interim --num_cpus 28 --source_type rcsb --bound, it starts successfully. However, after running for a while, the program hangs: it stops making progress and reports no error. Have you encountered this problem before?

amorehead commented 3 years ago

@XuBlack,

I have observed similar behavior in this script before. My hypothesis is that one of the external tools the script calls during feature generation (e.g., MSMS) may be causing a deadlock of sorts after a certain number of complexes have been processed. However, since the script is designed to resume processing from where it left off, I have not yet investigated this issue in more detail. If I find out anything more about what may be causing this deadlock, I will let you know.

Thank you for bringing this to my attention!
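
For anyone hitting the same hang: if the cause really is an external tool that never returns (this is only the hypothesis above, not a confirmed diagnosis), one possible mitigation is to put a time limit on each external call so that a stuck complex is skipped instead of blocking a worker. The following is a minimal sketch and is not taken from the DIPS-Plus code; the helper name run_with_timeout and the time limits are hypothetical, and the demo commands are stand-ins rather than real MSMS invocations.

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_with_timeout(
    cmd: Sequence[str], timeout_s: float
) -> Optional[subprocess.CompletedProcess]:
    """Run an external command; return None if it exceeds timeout_s seconds.

    Hypothetical helper for illustration only.
    """
    try:
        return subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising, so the
        # caller can log the failure and move on to the next item.
        return None


if __name__ == "__main__":
    # Stand-in for a tool call that hangs: sleeps far longer than the limit.
    stuck = run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=1.0
    )
    print("stuck call skipped:", stuck is None)

    # Stand-in for a tool call that finishes normally.
    ok = run_with_timeout([sys.executable, "-c", "print('done')"], timeout_s=30.0)
    print("normal call output:", ok.stdout.strip() if ok else None)
```

Because the script is designed to resume from where it left off, a skipped complex could be retried on a later run. Whether skipping is acceptable depends on how many complexes time out.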

XuBlack commented 3 years ago

Thank you for your reply! I will also look further into the cause of the deadlock.