Availability of Similarity Statistics Dataset

drorlab / combind

Integrated physics-based and ligand-based modeling.


Availability of Similarity Statistics Dataset #5

Closed by drewnutt 2 years ago

drewnutt commented 2 years ago

In the paper, you provide a summary of the structures and the number of ligands used to calculate the similarity statistics for the negative log-likelihood portion of the ComBind score (Supplementary Table 2). Is this dataset available for others to use? Alternatively, do you have a list of the PDB IDs it contains?

I believe this dataset would be required for me to determine the constant C for a new docking program when used in the ComBind pipeline.

jpaggi commented 2 years ago

I added stats_data/pdbs.txt, which lists the PDB IDs used when fitting the ComBind score, and stats_data/structures.tar.gz, which contains the structures separated into protein and ligand. Note that, for each protein, all the remaining ligands are docked to the alphabetically first structure, and the self-docking case isn't considered. As a result, the number of PDB IDs listed for each protein is one greater than the corresponding number in Supplementary Table 2.
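
For anyone reproducing the counts, the rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the repository: the function name `split_reference_and_docked`, the protein labels, and the PDB IDs are all hypothetical, "alphabetically first" is assumed to mean a plain string sort of upper-cased IDs, and the input is a dictionary built by hand because the layout of stats_data/pdbs.txt is not described in this thread.

```python
from typing import Dict, List, Tuple


def split_reference_and_docked(
    pdb_ids_by_protein: Dict[str, List[str]],
) -> Dict[str, Tuple[str, List[str]]]:
    """For each protein, pick the alphabetically first PDB ID as the docking
    target and treat the ligands of all other PDB IDs as the docked set.

    Assumption: "alphabetically first" is a plain sort of upper-cased IDs.
    """
    plan = {}
    for protein, pdb_ids in pdb_ids_by_protein.items():
        ordered = sorted({pdb_id.upper() for pdb_id in pdb_ids})
        if not ordered:
            continue  # nothing listed for this protein
        # The reference structure's own ligand is excluded (no self-docking),
        # so the docked set has one fewer entry than the list of PDB IDs.
        plan[protein] = (ordered[0], ordered[1:])
    return plan


# Hypothetical placeholder data, not taken from stats_data/pdbs.txt.
example = {
    "PROT_A": ["3CCC", "1AAA", "2BBB"],
    "PROT_B": ["5EEE", "4DDD"],
}

plan = split_reference_and_docked(example)
for protein, (reference, docked) in plan.items():
    print(f"{protein}: dock {len(docked)} ligand(s) to {reference}")
```

With this rule, a protein listed with N PDB IDs contributes N - 1 docked ligands, which matches the off-by-one relationship to Supplementary Table 2 noted above.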