chainer / chainer-chemistry

Chainer Chemistry: A Library for Deep Learning in Biology and Chemistry
MIT License

Why does evaluation of the tox21 models use the validation dataset instead of the test dataset? #335

Closed: tamuhey closed this issue 5 years ago

tamuhey commented 5 years ago

Thanks for this cool library!

The tox21 model is trained with the "train" and "val" datasets defined at the following line:

https://github.com/pfnet-research/chainer-chemistry/blob/bc90b1b0e5019e7676e4973965f87f09b4987d2c/examples/tox21/data.py#L81

The model is then evaluated on the "val" dataset, not on "test", as shown here:

https://github.com/pfnet-research/chainer-chemistry/blob/bc90b1b0e5019e7676e4973965f87f09b4987d2c/examples/tox21/predict_tox21_with_classifier.py#L50

Why is the "test" dataset not used?

corochann commented 5 years ago

Thank you for your comment. As far as I remember, the test data does not actually contain "label" information. I guess this is the tox21 contest dataset, so the test dataset is a true held-out "test" dataset whose labels are blind.

So we used the validation data to check the model's accuracy in the example.

tamuhey commented 5 years ago

Thanks for the reply!

The test labels are available here: http://bioinf.jku.at/research/DeepTox/tox21.html

corochann commented 5 years ago

Thank you for the info. You can also refer to the molnet dataset, which also contains tox21 data. Sorry, I did not check whether the test labels are included in molnet.
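For anyone who wants to check whether a given split actually carries labels, here is a minimal sketch. It does not call chainer-chemistry at all: it assumes the labels have already been extracted into a plain list of rows (one row per molecule, one column per task) and that unlabeled entries are encoded as -1. Both the -1 encoding and the helper name `labeled_fraction` are assumptions made for illustration, so adjust them to whatever your loader returns.

```python
from typing import Sequence

# Assumption: entries without a label are encoded as -1.
MISSING = -1


def labeled_fraction(labels: Sequence[Sequence[int]], missing: int = MISSING) -> float:
    """Return the fraction of (molecule, task) entries that carry a label."""
    total = sum(len(row) for row in labels)
    if total == 0:
        # An empty split has nothing to evaluate on.
        return 0.0
    labeled = sum(1 for row in labels for value in row if value != missing)
    return labeled / total


# Hypothetical toy label matrices: rows are molecules, columns are tasks.
val_labels = [[0, 1, -1], [1, -1, 0]]
blind_test_labels = [[-1, -1, -1], [-1, -1, -1]]

print("val:", labeled_fraction(val_labels))
print("test:", labeled_fraction(blind_test_labels))
```

A split whose fraction is 0.0 is blind and cannot be used to compute accuracy or ROC-AUC, which is the situation described above for the contest test set. If the fraction is above zero, the split can in principle be used for evaluation.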