chainer / chainer-chemistry

Chainer Chemistry: A Library for Deep Learning in Biology and Chemistry
MIT License

Numpy related error in NFP tutorial #406

Closed nshervt closed 3 years ago

nshervt commented 4 years ago

When following NFP's homo prediction tutorial, I received the following error:

ValueError: Object arrays cannot be loaded when allow_pickle=False

when I ran the following code:

from chainer.datasets import split_dataset_random
from chainer_chemistry import datasets as D
from chainer_chemistry.dataset.preprocessors import preprocess_method_dict
from chainer_chemistry.datasets import NumpyTupleDataset

cache_dir = 'input/homo/'
dataset = NumpyTupleDataset.load(cache_dir + 'data.npz')
corochann commented 4 years ago

What are your chainer and chainer-chemistry versions? Can you upgrade and try again?

pip install -U chainer chainer-chemistry
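For readers hitting the same message: the error text comes from NumPy itself. Since NumPy 1.16.3, numpy.load defaults to allow_pickle=False, so a .npz file that contains object arrays fails to load unless the caller opts in. The sketch below reproduces the error with an in-memory file. The array contents and the key name atoms are made up for illustration, and the snippet does not show how chainer-chemistry handles this internally; upgrading as suggested above remains the simpler fix.

```python
import io

import numpy as np

# Build a ragged object array (illustrative data only).
ragged = np.empty(2, dtype=object)
ragged[0] = [1, 2, 3]
ragged[1] = [4, 5]

# Save it to an in-memory .npz archive; object arrays are stored via pickle.
buf = io.BytesIO()
np.savez(buf, atoms=ragged)

# Loading with the default settings fails once the object array is accessed.
buf.seek(0)
try:
    np.load(buf)["atoms"]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Opting in to pickle makes the same archive loadable.
buf.seek(0)
loaded = np.load(buf, allow_pickle=True)["atoms"]

print(error_message)
print(loaded[0], loaded[1])
```

Only pass allow_pickle=True for files you trust, since unpickling can execute arbitrary code.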

nshervt commented 4 years ago

Ha! Look at that! Thanks, that was the issue. One more point: the tutorial also needs to include import os.

Following the rest of the tutorial, I tried to invoke the training step but received the following error:

Traceback (most recent call last):
  File "/Users/Nshervt/Private/Research/Scattering/NFP_test/train.py", line 13, in <module>
    from chainer_chemistry.dataset.converters import converter_method_dict
ImportError: cannot import name 'converter_method_dict' from 'chainer_chemistry.dataset.converters' (/Users/Nshervt/anaconda3/lib/python3.7/site-packages/chainer_chemistry/dataset/converters.py)

I was thinking maybe it would be a better idea to continue the steps for training and prediction in the same fashion that loading, saving, and the model definition are discussed step by step.

At the moment (correct me if I'm wrong), it seems that a model is defined, then we throw that out and say "ok, if you go to the train script there are a bunch of models there that are defined in a similar fashion, and you learned the math and should be able to do the rest" (or maybe it's just that I'm too naive).

corochann commented 4 years ago

Sorry, converter_method_dict was added in the master branch but is not included in the v0.6.0 release. I guess you are running the code from the master branch with the v0.6.0 library. Please try either of the following:

  1. Install master branch, and run master branch example:
git clone https://github.com/pfnet-research/chainer-chemistry.git
pip install -e chainer-chemistry
# Run your NFP example with cloned code...

OR

  2. Install v0.6.0
git clone https://github.com/pfnet-research/chainer-chemistry.git -b v0.6.0
pip install chainer-chemistry==0.6.0
# Run your NFP example with cloned code...
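Before picking one of the two options, it can help to confirm which release is installed and whether it exposes the name the example imports. The following is a minimal sketch using only the standard library (Python 3.8+); installed_version and has_name are hypothetical helper names introduced here, not part of chainer-chemistry.

```python
import importlib
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def has_name(module_name, attr):
    """Return True if the module can be imported and exposes the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)


# Report the library version and whether the name from the traceback exists.
print(installed_version("chainer-chemistry"))
print(has_name("chainer_chemistry.dataset.converters", "converter_method_dict"))
```

If the second line prints False while the example script imports converter_method_dict, the installed library and the example code most likely come from different versions.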
corochann commented 4 years ago

In the latter comment, do you mean that more examples or documentation are needed to understand the example? Sorry for the lack of documentation for now...

Please refer to this tutorial for a more detailed document to learn the basics.

nshervt commented 4 years ago

Thanks, I was referring to the same tutorial; it needs more details. I followed it up to Run: I loaded the data and defined my model (although I'm simply importing MLP and NFP, I understand what is happening). Now I don't understand how I am supposed to train the model that we just defined. The tutorial is using

~/chainer-chemistry/examples/qm9$ python train_qm9.py --method nfp --label homo --gpu 0

Now it doesn't seem that --method nfp has anything to do with the model that I just defined a few paragraphs earlier. What if I made a slight modification to the model and wanted to test it? I looked at the code in the repo, and it seems that for prediction you have further defined a predict function in the model. What I was saying was that maybe (for someone like me who is trying to figure out whether a nice repo found on the internet is worth investing time in) it would be a good idea to add 5 lines of code that train this model and 5 lines of code that do prediction (similar to what you have done in Model Definition and Dataset Preparation).

Let me finish my ranting (my apologies) by summing up:

  1. How can I train the model defined earlier as model = GraphConvPredictor(NFP(n_unit, n_unit, conv_layers), MLP(n_unit, 1)) from the Python code where we defined it, rather than from the command line as the tutorial suggests (~/chainer-chemistry/examples/qm9$ python train_qm9.py --method nfp --label homo --gpu 0)?
  2. How do I proceed to do prediction using the model that we just defined?

Thank you very much!

corochann commented 4 years ago
  1. model = GraphConvPredictor(NFP(n_unit, n_unit, conv_layers), MLP(n_unit, 1)) is done in the set_up_predictor method in the example.

https://github.com/pfnet-research/chainer-chemistry/blob/master/examples/qm9/train_qm9.py#L143-L144

So you can just replace this line to use your custom model defined above.

  2. Further, the predict method and loss calculation are defined in the Regressor class for regression tasks and the Classifier class for classification tasks. So you need to "wrap" your model using Regressor or Classifier. Since the wrapper defines both the loss calculation and the predict method, it can be used in both training and prediction.

Refer: https://github.com/pfnet-research/chainer-chemistry/blob/master/examples/qm9/train_qm9.py#L149-L150
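The wrapping idea can be illustrated without the library. The sketch below is a toy analogue in plain Python, not chainer-chemistry's Regressor: ToyPredictor, ToyRegressor, mean_squared_error, and train are hypothetical names introduced here, the data is made up, and the gradients are computed by hand where Chainer's optimizer would backpropagate through the returned loss. It only shows why one wrapper object can serve both training (calling it returns a loss) and prediction (its predict method returns outputs).

```python
class ToyPredictor:
    """Stand-in for a custom model: y = w * x + b."""

    def __init__(self, w=0.0, b=0.0):
        self.w = w
        self.b = b

    def __call__(self, xs):
        return [self.w * x + self.b for x in xs]


class ToyRegressor:
    """Hypothetical wrapper that adds a loss and a predict method."""

    def __init__(self, predictor, lossfun):
        self.predictor = predictor
        self.lossfun = lossfun

    def __call__(self, xs, ts):
        # Training path: forward pass followed by the loss.
        return self.lossfun(self.predictor(xs), ts)

    def predict(self, xs):
        # Prediction path: forward pass only.
        return self.predictor(xs)


def mean_squared_error(ys, ts):
    return sum((y - t) ** 2 for y, t in zip(ys, ts)) / len(ts)


def train(regressor, xs, ts, lr=0.1, epochs=500):
    """Gradient descent on the toy predictor's two parameters."""
    p = regressor.predictor
    n = len(xs)
    for _ in range(epochs):
        ys = p(xs)
        # Hand-derived gradients of the mean squared error.
        grad_w = sum(2 * (y - t) * x for y, t, x in zip(ys, ts, xs)) / n
        grad_b = sum(2 * (y - t) for y, t in zip(ys, ts)) / n
        p.w -= lr * grad_w
        p.b -= lr * grad_b
    return regressor(xs, ts)


# Made-up data following t = 2 * x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ts = [1.0, 3.0, 5.0, 7.0]

regressor = ToyRegressor(ToyPredictor(), mean_squared_error)
final_loss = train(regressor, xs, ts)
prediction = regressor.predict([4.0])[0]
print(final_loss, prediction)
```

Swapping ToyPredictor for a different model leaves the wrapper and the training loop unchanged, which mirrors the point above: replace the predictor line with your own model and keep the wrapper.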

nshervt commented 4 years ago

Thanks! I'll check this soon and comment for closing the issue.