How to output predicted values?

TaTKSM commented 5 years ago

Hi,

I have a question about "chainer-chemistry/examples/own_dataset/". In the example code "predict_own_dataset.py" the outcomes of the prediction are neither displayed nor saved. Instead it only yields simple metrics for evaluation such as MAE. But I'd like to see every single prediction. Then what would be the minimal modification of the code that enables the code to yield all the predicted values?

I looked at the code for another example, qm9, and copy-pasted the following snippet from there into "predict_own_dataset.py".

def extract_inputs(batch, device=None): 
    return concat_mols(batch, device=device)[:-1] 

def postprocess_fn(x): 
    if scaler is not None: 
        scaled_x = scaler.inverse_transform(x) 
        return scaled_x 
    else: 
        return x 

print(regressor.predict(test, converter=extract_inputs, postprocess_fn=postprocess_fn))

which somehow worked. However I'm not 100% sure if this is the right thing to do. Is this the correct way to extract the predicted values, or is there any other (perhaps more elegant) way?

Thanks in advance.

corochann commented 5 years ago

Yes, the code you wrote is the way to get predicted values. See the QM9 example for reference:

https://github.com/pfnet-research/chainer-chemistry/blob/master/examples/qm9/predict_qm9.py#L133-L148

Some explanation:

The regressor holds a predictor, given to it when it is created, and this predictor is used when you call regressor.predict.
The predictor only needs the input x (usually atom and adj), but the test dataset also contains labels. The converter=extract_inputs argument removes the labels from the dataset before it is fed into the predictor.
Additional preprocessing and postprocessing can be written if necessary. Here postprocess_fn is specified because the predictor is trained to predict standard-scaled label values. We need to "inverse transform" the values back to the original scale to get the actual predictions.
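The flow described above (strip the labels, run the predictor, undo the scaling) can be sketched in plain Python. This is only an illustration under stated assumptions: SimpleScaler, toy_predictor and predict are hypothetical stand-ins written for this example, not the chainer-chemistry classes, and the dataset is a toy list of (input, label) pairs.

```python
from statistics import mean, pstdev


class SimpleScaler:
    """Hypothetical stand-in for a standard scaler (not the library class)."""

    def fit(self, values):
        self.mean_ = mean(values)
        self.std_ = pstdev(values)
        return self

    def transform(self, values):
        return [(v - self.mean_) / self.std_ for v in values]

    def inverse_transform(self, values):
        return [v * self.std_ + self.mean_ for v in values]


def extract_inputs(sample):
    # Drop the trailing label so that only the model inputs remain.
    return sample[:-1]


def predict(dataset, predictor, converter, postprocess_fn):
    # Mimics the described flow: strip labels, predict, then postprocess.
    raw = [predictor(*converter(sample)) for sample in dataset]
    return postprocess_fn(raw)


# Toy dataset of (input, label) pairs.
dataset = [("mol_a", 1.0), ("mol_b", 2.0), ("mol_c", 3.0)]
labels = [label for _, label in dataset]

scaler = SimpleScaler().fit(labels)
scaled_labels = dict(zip((x for x, _ in dataset), scaler.transform(labels)))


def toy_predictor(x):
    # Pretend the network learned the scaled labels perfectly.
    return scaled_labels[x]


predictions = predict(dataset, toy_predictor, extract_inputs,
                      scaler.inverse_transform)
print(predictions)
```

Here the toy predictor outputs values in the scaled space, so a single inverse transform in postprocess_fn brings them back to the original label scale.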

TaTKSM commented 5 years ago

Many thanks for your kind reply. Your answer greatly helped me decipher the code.

Here's one more related question. I ran python train_own_dataset.py --epoch 1000 to train the model on the small sample dataset (dataset_train.csv). After such long training, the values of main/mean_abs_error and main/root_mean_sqr_error finally dropped to 0.001759 and 0.002538, respectively. Apparently this is an overfitting regime where the model almost "memorizes" all the training samples. To confirm this, I ran python predict_own_dataset.py --datafile dataset_train.csv to make predictions on the training set, which yielded the predicted values

-0.24282945692539215 -0.24397610127925873 -0.24398304522037506, ...

together with the output Evaluation result: {'main/loss': 0.00017645423273885777, 'main/mean_abs_error': 0.0001305427091817061, 'main/root_mean_sqr_error': 0.00023827564048891267}. However the CORRECT target values in dataset_train.csv are

-0.2274 -0.2678 -0.2685 ...

So the bottom line is that the model appears to underfit rather than overfit. It surprises me that after 1000 epochs the model still doesn't fit the training data well. What's going on here? Note also that the values of main/mean_abs_error and main/root_mean_sqr_error in the evaluation result quoted above are too small to be consistent with this underfitting. What do they actually measure? Thanks.

TaTKSM commented 5 years ago

I think I found a solution to the puzzle. I changed the line print(regressor.predict(test, converter=extract_inputs, postprocess_fn=postprocess_fn)) to print(regressor.predict(test, converter=extract_inputs, postprocess_fn=lambda x: x)). Then the code yields the correct prediction values. So the inverse transform was not necessary! The reason is probably that the inverse transform is already built into the class ScaledGraphConvPredictor defined in predict_own_dataset.py.

Do you agree that this is the right remedy?

corochann commented 5 years ago

Thank you for investigating the issue. Yes, I think so!
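Why a second inverse transform distorts the output can be seen in a small numeric sketch. It assumes a standard scaler of the form x * std + mean, and, purely for illustration, computes mean and std from the three target values quoted in this thread; the real scaler was fit on the full training set, so the numbers will not match the reported predictions exactly.

```python
from statistics import mean, pstdev

# The three target values quoted in the thread (illustrative subset only).
targets = [-0.2274, -0.2678, -0.2685]
mu = mean(targets)
sigma = pstdev(targets)


def inverse_transform(values):
    # Undo standard scaling: scaled value * std + mean.
    return [v * sigma + mu for v in values]


# What a perfectly trained network would output in the scaled space.
scaled = [(t - mu) / sigma for t in targets]

once = inverse_transform(scaled)   # inverse transform applied by the predictor
twice = inverse_transform(once)    # applied again by postprocess_fn

print("once: ", once)
print("twice:", twice)
```

Applying the transform once recovers the targets. Applying it twice multiplies already-unscaled values by the small std again, which squeezes all predictions into a narrow band near the mean, qualitatively like the tightly clustered values reported above.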

TaTKSM commented 5 years ago

A new problem (bug?) was found regarding the methods "schnet" and "weavenet", which I report now.

Since I have understood the correct way to extract prediction values (see my previous reply), next I tested all the methods in this model: nfp, ggnn, schnet, weavenet, rsgcn, relgcn and relgat, which can be specified in the --method option of the code. I trained the model on "dataset_train.csv" for 10 epochs and then tried prediction on "dataset_test.csv" which includes just 10 samples. I expected (of course) that 10 predictions would be returned, but to my surprise, the number of prediction values returned by each method was as given below.

nfp: 10
ggnn: 10
schnet: 9
weavenet: 7
rsgcn: 10
relgcn: 10
relgat: 10

What's happening with schnet and weavenet? They seem to fail on some molecules. At least they seem to "ignore" some molecules. Is this a known bug? I didn't change any hyperparameter of the model, so anyone should be able to reproduce this behavior immediately. (I specified the same method in training and prediction, needless to say.)

To gain more understanding, I've also run the code "predict_own_dataset.py" on "dataset_train.csv" which includes 90 samples. Here's the number of prediction values I got by specifying each method:

nfp: 90
ggnn: 90
schnet: 88
weavenet: 78
rsgcn: 90
relgcn: 90
relgat: 90

So, schnet fails on 2 molecules and weavenet fails on 12 molecules. What's causing this strange behavior?

corochann commented 5 years ago

Preprocessing depends on the network, and for some molecules feature extraction fails inside RDKit functions (some features cannot be extracted for some molecules).

In particular, SchNet needs to calculate the 3-D positions of atoms, and WeaveNet calculates many molecular features, which sometimes fails.
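A rough sketch of how this leads to fewer predictions than input rows, assuming molecules whose feature extraction fails are skipped during preprocessing. The featurize function and the "?" failure rule are hypothetical placeholders for an RDKit-based step, not the library's code. Keeping the indices of the surviving molecules makes it possible to match predictions back to the original rows.

```python
def featurize(smiles):
    # Hypothetical featurizer: raises for inputs it cannot handle,
    # standing in for an RDKit-based step that fails on some molecules.
    if "?" in smiles:
        raise ValueError("feature extraction failed for " + smiles)
    return len(smiles)


def preprocess(smiles_list):
    # Returns the extracted features and the indices of molecules that survived.
    features, kept_indices = [], []
    for i, smiles in enumerate(smiles_list):
        try:
            features.append(featurize(smiles))
        except ValueError:
            continue  # failing molecules are dropped, so the dataset shrinks
        kept_indices.append(i)
    return features, kept_indices


molecules = ["CCO", "C?C", "c1ccccc1", "N?"]
features, kept = preprocess(molecules)
print(len(molecules), "inputs ->", len(features), "featurized; kept rows:", kept)
```

With such a record of kept indices, a count mismatch like 10 inputs versus 9 predictions can be traced to the specific molecules that were dropped.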

chainer / chainer-chemistry

How to output predicted values? #362