Open TaTKSM opened 5 years ago
Yes, the code you have written is the right way to predict values. See the QM9 examples for reference.
Some explanation follows.
regressor owns predictor when created (here), and this predictor is used when you call regressor.predict.
predictor only needs the input x information (usually atom and adj), but the test dataset also contains label information. The converter=extract_inputs argument removes the label information from the dataset before it is fed into predictor.
postprocess_fn is specified because the predictor is trained to predict standard-scaled label values during training. We need to "inverse transform" the values to the original scale to obtain the actual predictions.
Many thanks for your kind reply. Your answer greatly helped me decipher the code.
Here's one more related question. I've run python train_own_dataset.py --epoch 1000 to train the model on the small sample data set (dataset_train.csv). After such a long training run, the values of main/mean_abs_error and main/root_mean_sqr_error finally dropped to 0.001759 and 0.002538, respectively. Apparently this is an overfitting regime where the model almost "memorizes" all the training samples. To confirm this, I've run python predict_own_dataset.py --datafile dataset_train.csv to make predictions on the training set, which yielded the predicted values
-0.24282945692539215 -0.24397610127925873 -0.24398304522037506, ...
together with the output Evaluation result: {'main/loss': 0.00017645423273885777, 'main/mean_abs_error': 0.0001305427091817061, 'main/root_mean_sqr_error': 0.00023827564048891267}.
However the CORRECT target values in dataset_train.csv are
-0.2274 -0.2678 -0.2685 ...
So the bottom line is that the model shows underfitting instead of overfitting. It is a surprise for me that after 1000 epochs the model still doesn't fit the training data well. What's going on here?
Note also that the values of main/mean_abs_error and main/root_mean_sqr_error in the Evaluation result quoted above are too small in view of this underfitting. What do they actually measure?
Thanks.
I think I found a solution to the puzzle. I modified the line
print(regressor.predict(test, converter=extract_inputs, postprocess_fn=postprocess_fn))
to
print(regressor.predict(test, converter=extract_inputs, postprocess_fn=lambda x: x))
Then the code yields correct prediction values. So the inverse transform was not necessary!
The reason is probably that the inverse transform is already built into the class ScaledGraphConvPredictor defined in predict_own_dataset.py.
Do you agree that this is the right remedy?
Thank you for investigating the issue. Yes, I think so!
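To illustrate why a second inverse transform produces nearly identical, wrong-looking predictions, here is a minimal sketch in plain Python. The labels, mu and sigma are hypothetical values chosen for illustration (they are not taken from dataset_train.csv), and scale / inverse_scale are stand-ins for a standard scaler, not the chainer-chemistry API.

```python
from statistics import mean, pstdev

# Hypothetical raw training labels (NOT taken from dataset_train.csv).
labels = [-0.20, -0.25, -0.30, -0.22, -0.28]
mu = mean(labels)
sigma = pstdev(labels)


def scale(y):
    """Standard scaling: what the model is trained to predict."""
    return (y - mu) / sigma


def inverse_scale(z):
    """Inverse of standard scaling: back to the original label scale."""
    return z * sigma + mu


# A predictor that already undoes the scaling internally returns
# values on the original scale (here: a perfect fit, for simplicity).
already_original = [inverse_scale(scale(y)) for y in labels]

# Applying the inverse transform a second time, as a redundant
# postprocess_fn would, squeezes every value toward mu.
double_transformed = [inverse_scale(y) for y in already_original]

spread_correct = max(already_original) - min(already_original)
spread_double = max(double_transformed) - min(double_transformed)

print("correct predictions:", already_original)
print("double-transformed: ", double_transformed)
print("spread shrinks by a factor of sigma:", spread_double / spread_correct)
```

Because sigma is much smaller than 1 for labels of this magnitude, the doubly transformed values all land close together, which is consistent with the symptom reported above. Passing an identity function as postprocess_fn avoids the second transform.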
A new problem (bug?) was found regarding the methods "schnet" and "weavenet", which I report now.
Since I have understood the correct way to extract prediction values (see my previous reply), next I tested all the methods in this model: nfp, ggnn, schnet, weavenet, rsgcn, relgcn and relgat, which can be specified in the --method option of the code. I trained the model on "dataset_train.csv" for 10 epochs and then tried prediction on "dataset_test.csv" which includes just 10 samples. I expected (of course) that 10 predictions would be returned, but to my surprise, the number of prediction values returned by each method was as given below.
What's happening with schnet and weavenet? They seem to fail on some molecules. At least they seem to "ignore" some molecules. Is this a known bug? I didn't change any hyperparameter of the model, so anyone should be able to reproduce this behavior immediately. (I specified the same method in training and prediction, needless to say.)
To gain more understanding, I've also run the code "predict_own_dataset.py" on "dataset_train.csv" which includes 90 samples. Here's the number of prediction values I got by specifying each method:
So, schnet fails on 2 molecules and weavenet fails on 12 molecules. What's causing this strange behavior?
Preprocessing depends on the network, and some molecular feature extraction fails inside RDKit functions (certain features cannot be extracted for certain molecules).
In particular, SchNet needs to calculate the 3-D positions of atoms, and WeaveNet calculates many molecular features, and these calculations sometimes fail.
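The effect described above can be sketched as follows: a preprocessing loop that skips any molecule whose feature extraction raises an error returns fewer samples than it was given. Everything here is hypothetical and for illustration only: FeatureExtractionError, extract_features and the max_atoms limit are invented stand-ins (the letter count is a crude placeholder, not a real atom count), not RDKit or chainer-chemistry functions.

```python
class FeatureExtractionError(Exception):
    """Hypothetical error raised when a molecule cannot be featurized."""


def extract_features(smiles, max_atoms=8):
    # Crude stand-in for a real featurizer: count letters in the SMILES
    # string and reject inputs above an arbitrary limit. A real featurizer
    # might instead fail when, e.g., 3-D coordinates cannot be generated.
    n_letters = sum(ch.isalpha() for ch in smiles)
    if n_letters > max_atoms:
        raise FeatureExtractionError(f"cannot featurize {smiles!r}")
    return [n_letters]


def preprocess(smiles_list):
    """Featurize each molecule, skipping the ones that fail."""
    features, is_successful = [], []
    for smiles in smiles_list:
        try:
            features.append(extract_features(smiles))
            is_successful.append(True)
        except FeatureExtractionError:
            # The molecule is silently dropped from the dataset.
            is_successful.append(False)
    return features, is_successful


smiles_list = ["CCO", "c1ccccc1", "CCCCCCCCCCCC", "CC(=O)O"]
features, is_successful = preprocess(smiles_list)
skipped = [s for s, ok in zip(smiles_list, is_successful) if not ok]

print(len(smiles_list), "molecules in,", len(features), "featurized")
print("skipped:", skipped)
```

Keeping a per-molecule success flag like is_successful makes it possible to line the returned predictions up with the rows of the input file, since the prediction list alone no longer matches the original row order one-to-one.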
Hi,
I have a question about "chainer-chemistry/examples/own_dataset/". In the example code "predict_own_dataset.py" the outcomes of the prediction are neither displayed nor saved. Instead it only yields simple metrics for evaluation such as MAE. But I'd like to see every single prediction. Then what would be the minimal modification of the code that enables the code to yield all the predicted values?
I looked into the code of another example, qm9, and from there I copied and pasted the following snippet into "predict_own_dataset.py".
which somehow worked. However I'm not 100% sure if this is the right thing to do. Is this the correct way to extract the predicted values, or is there any other (perhaps more elegant) way?
Thanks in advance.
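As a rough illustration of what "yielding every single prediction" involves, the sketch below strips the label from each sample, runs a model on the remaining inputs, and writes one row per sample next to its label. It uses only the standard library. The dataset, strip_label and toy_predictor are hypothetical placeholders and do not reproduce the chainer-chemistry classes or the snippet from the qm9 example.

```python
import csv
import io

# Hypothetical dataset: each sample is (atom_numbers, adjacency, label).
dataset = [
    ([6, 6, 8], [[0, 1, 0], [1, 0, 1], [0, 1, 0]], -0.26),
    ([6, 8], [[0, 1], [1, 0]], -0.27),
]


def strip_label(sample):
    """Drop the trailing label so that only the model inputs remain."""
    return sample[:-1]


def toy_predictor(atoms, adj):
    # Placeholder model: NOT a trained network, just a deterministic
    # function of the inputs so that the example runs.
    return -0.01 * sum(atoms) / len(atoms)


# One prediction per sample, computed from the inputs only.
predictions = [toy_predictor(*strip_label(sample)) for sample in dataset]

# Write label and prediction side by side (to an in-memory buffer here;
# a real script would open a file instead).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["index", "label", "prediction"])
for i, (sample, pred) in enumerate(zip(dataset, predictions)):
    writer.writerow([i, sample[-1], f"{pred:.4f}"])

print(buffer.getvalue())
```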