prediction values changing with a number of input molecules

sleeper2173 commented 5 years ago

I am trying to predict molecular properties using a pre-trained model. However, I find that the predicted values change depending on the number of input molecules and on the batchsize parameter passed to the predict() function. How can I get consistent values?

corochann commented 5 years ago

Which model (network) are you using? Can you share your prediction code?

Please refer to the typical usage of the predict method: https://github.com/pfnet-research/chainer-chemistry/blob/master/examples/qm9/predict_qm9.py#L146

sleeper2173 commented 5 years ago

train_qm9.py.txt predict_qm9.py.txt

Thank you for your quick reply. I attached the code files train_qm9.py (same as the original) and predict_qm9.py (only a few lines modified). The problem can be reproduced as follows:

1) Train a QM9 model:

    python train_qm9.py \
        --method ggnn \
        --label A \
        --conv-layers 1 \
        --gpu 0 \
        --epoch 10 \
        --unit-num 10 \
        --num-data 100

2) Predict the property of the first molecule:

    python predict_qm9.py \
        --method ggnn \
        --label A \
        --gpu -1 \
        --num-data 1

3) Predict the properties of the first 10 molecules:

    python predict_qm9.py \
        --method ggnn \
        --label A \
        --gpu -1 \
        --num-data 10

The predicted value for the first molecule differs between steps 2) and 3).

corochann commented 5 years ago

Thank you for the report.

This is a known issue: the current GGNN model is not invariant to input size. When zero vectors are padded in as "virtual nodes", the output value changes. We are going to fix this in the following PR: https://github.com/pfnet-research/chainer-chemistry/pull/311
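To illustrate the effect, here is a toy NumPy sketch. It is not the actual GGNN code or the fix in the PR. It assumes that molecules in a batch are zero-padded to the size of the largest molecule, and that the readout sums a per-node layer with a bias over all node slots. The weights `W`, `b` and the helpers `pad_nodes`, `readout_unmasked`, and `readout_masked` are hypothetical names made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy weights; these are NOT the chainer-chemistry GGNN parameters.
W = rng.normal(size=(4, 3))
b = rng.normal(size=3)


def node_layer(node_feats):
    """Per-node affine layer followed by a sigmoid."""
    return 1.0 / (1.0 + np.exp(-(node_feats @ W + b)))


def readout_unmasked(node_feats):
    """Sum over every node slot, including zero-padded virtual nodes."""
    return node_layer(node_feats).sum(axis=0)


def readout_masked(node_feats, mask):
    """Sum over real nodes only; padded slots are multiplied by 0."""
    return (node_layer(node_feats) * mask[:, None]).sum(axis=0)


def pad_nodes(node_feats, n_total):
    """Zero-pad a molecule to n_total node slots and return (padded, mask)."""
    n_real, dim = node_feats.shape
    padded = np.zeros((n_total, dim))
    padded[:n_real] = node_feats
    mask = np.zeros(n_total)
    mask[:n_real] = 1.0
    return padded, mask


mol = rng.normal(size=(3, 4))          # one molecule: 3 atoms, 4 features each
small, small_mask = pad_nodes(mol, 3)  # predicted alone: no padding needed
large, large_mask = pad_nodes(mol, 9)  # batched with a 9-atom molecule

# Each padded slot contributes sigmoid(b) != 0 to the unmasked sum.
unmasked_differs = not np.allclose(readout_unmasked(small), readout_unmasked(large))
masked_matches = np.allclose(
    readout_masked(small, small_mask), readout_masked(large, large_mask)
)
print(unmasked_differs, masked_matches)  # True True
```

In this sketch the unmasked output for the same molecule depends on how many padded slots it was given, which in turn depends on which other molecules share the batch. Masking out the padded slots before the sum removes that dependence.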

prediction values changing with a number of input molecules #363