gnina / models

Trained caffe models

What dataset for the built-in models? #36

Open gaoshan2006 opened 11 months ago

gaoshan2006 commented 11 months ago

Hi Developer,

I see there are 5 default built-in models, including "redock_default2018_2", "general_default2018_3", "crossdock_default2018" and 2 "Dense" models. It looks like the redock model was built from the "redock" subset of the CrossDocked2020 dataset. For "general_default2018", I originally guessed that the model was built from the PDBbind2016 General set. I tried to build my own default2018 model using only PDBbind2016 General, but its affinity prediction performance is much poorer than that of "general_default2018_3" (comparing single models only). So I guess more data was used. Which dataset was used to build "general_default2018"?

Another question: I guess the "crossdock_default2018" and "Dense" models were both built from the CrossDocked2020 dataset, right? There are several series of files in the "CrossDocked2020/types" folder, such as the "it2_tt_v1.3_0_train", "it2_tt_v1.3_10p20n_train" and "mod_it2_tt_v1.3_0_train" files. Which series of types files was used for the "crossdock_default2018" and "Dense" models?

Thanks a lot!

dkoes commented 10 months ago

"general" means PDBbind general, but it uses docked poses, not crystal poses. The deployed models were trained on the full dataset and used CrossDocked 1.0 (1.3 improves performance).
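
For anyone repeating the single-model affinity comparison described in the question, a minimal sketch of one way to put numbers on it is to compute the Pearson correlation and RMSE between predicted and experimental affinities for each model. This is not the evaluation script used for the deployed models. The function names and the example values below are hypothetical, and the predictions are assumed to have already been collected, one per complex, in the same order as the experimental values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def compare_models(experimental, predictions_by_model):
    """Return {model_name: {"pearson_r": ..., "rmse": ...}} for each model."""
    return {
        name: {
            "pearson_r": pearson_r(preds, experimental),
            "rmse": rmse(preds, experimental),
        }
        for name, preds in predictions_by_model.items()
    }


if __name__ == "__main__":
    # Made-up affinities (pK units), for illustration only.
    experimental = [6.2, 7.9, 5.1, 8.4, 4.7]
    predictions = {
        "general_default2018_3": [6.0, 7.1, 5.6, 7.8, 5.2],  # hypothetical output
        "my_default2018": [5.5, 6.0, 6.1, 6.4, 5.9],  # hypothetical output
    }
    for name, stats in compare_models(experimental, predictions).items():
        print(name, stats)
```

Since the built-in general model was trained on docked poses, scoring both models on the same set of poses keeps the comparison like for like.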