hkmztrk / DeepDTA


Question about KIBA score #29

Closed baehaeli closed 2 years ago

baehaeli commented 2 years ago

Hello, I am very interested in your work.

For the KIBA dataset, you use the KIBA score proposed in 'Making Sense of Large-Scale Kinase Inhibitor Bioactivity Data Sets: A Comparative and Integrative Analysis' (Tang et al., 2014).

However, the KIBA scores reported by Tang et al. and the ones used in DeepDTA seem to be different.

For example, the KIBA score between 'CHEMBL98350' and 'P48736' is reported as 3.21982 in Tang et al., but the value for the same pair in the DeepDTA data is different.

So I would like to know in detail how you computed the KIBA scores, for example which values of the parameters Li and Ld you used.

[Two equation images (eq1, eq2) were attached here.]

Thank you

hkmztrk commented 2 years ago

Hi @baehaeli thanks for your interest!

We didn't make any changes to the original KIBA scores ourselves; we used the KIBA dataset provided in the SimBoost study (He et al., 2017). I will try to check this over the weekend.

baehaeli commented 2 years ago

Thank you for the reply.

Have you had a chance to check why the KIBA scores differ?

hkmztrk commented 2 years ago

Hi @baehaeli, thanks for reminding me, I totally forgot to respond. Again, please see the SimBoost paper (He et al., 2017) for details; they apply a transformation to the KIBA scores as follows:

In an additional preprocessing step, we transformed the KIBA dataset by taking the negative of each value and adding the minimum to all values in order to obtain a threshold where all values above the threshold are classified as binding. The KIBA threshold of 3.0 in the untransformed dataset then becomes 12.1. (He et al., 2017)

In DeepDTA we use these outputs from He et al. directly, without any further transformation. Hope this helps.
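As a rough illustration of the quoted transformation, here is a minimal Python sketch. It assumes the transformation reduces to `transformed = offset - original`, with the offset inferred only from the quoted threshold mapping (3.0 becomes 12.1, so the offset is about 15.1). The function names and `ASSUMED_OFFSET` are hypothetical and do not come from the SimBoost or DeepDTA code, and since the quoted 12.1 is probably rounded, the offset is approximate.

```python
# Offset inferred from the quoted threshold mapping (3.0 -> 12.1).
# This is an assumption, not a value taken from SimBoost or DeepDTA.
ASSUMED_OFFSET = 12.1 + 3.0


def to_transformed(kiba_original, offset=ASSUMED_OFFSET):
    """Negate an original KIBA score and shift it by the assumed offset."""
    return offset - kiba_original


def to_original(kiba_transformed, offset=ASSUMED_OFFSET):
    """Invert the shift to recover the original-scale KIBA score."""
    return offset - kiba_transformed


if __name__ == "__main__":
    # The untransformed threshold of 3.0 from the quote.
    print(round(to_transformed(3.0), 5))
    # The CHEMBL98350 / P48736 value reported by Tang et al.
    print(round(to_transformed(3.21982), 5))
```

Under this assumption, the 3.21982 reported by Tang et al. would map to roughly 11.88 on the transformed scale. Whether that matches the value stored in the DeepDTA data depends on the exact offset He et al. used.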

baehaeli commented 2 years ago

Thank you!