Hi, this is because the `--dataset davis` parameter will invoke the `load_davis()` function, and in `load_davis()` the `frac_train` parameter is set to 0.8 (the default value), resulting in only 80% of the data being output in the prediction file. Please refer to this line in the `load_nci60()` function to learn how to set the split ratios: https://github.com/simonfqy/PADME/blob/5e97ba97f1389ea975b196a31b3464ca2cd00512/dcCustom/molnet/load_function/nci60_dataset.py#L93
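For illustration only, here is a minimal sketch of how a DeepChem-style load function typically forwards the split fractions to a splitter; the exact code inside `load_davis()`/`load_nci60()` may differ, so treat the names below as assumptions rather than the actual PADME code:

```python
import numpy as np
import deepchem as dc  # PADME's dcCustom fork exposes a very similar API

# Dummy data standing in for the DiskDataset that load_davis() builds
# from restructured.csv (a stand-in, not the real featurized data).
X = np.random.rand(10, 16)
y = np.random.rand(10, 1)
dataset = dc.data.DiskDataset.from_numpy(X, y)

# Passing frac_train=1.0 keeps every drug-target pair in the "train" split,
# so nothing is dropped from the prediction file.
splitter = dc.splits.RandomSplitter()
train, valid, test = splitter.train_valid_test_split(
    dataset, frac_train=1.0, frac_valid=0.0, frac_test=0.0)
```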
My implementation of the `load_nci60()` function is very specific: it relies on ToxCast, which is a very complicated dataset. The problem with directly using `load_davis()` is that the `transformer` used inside is not the transformer calculated on the original dataset (the Davis dataset in this case), but on your own dataset for prediction. So, as I said before, I suggest you create a "template dataset" so the program knows which drug-target pairs to predict. Its format should be identical to the `restructured.csv` files in the dataset folders like `davis_data/`, the only difference being that the drug-target pairs' interaction values should all be 0 (or any other number, but 0 is more convenient). Making the interactions all zero will help you see the problems, should any arise. I have decided to include a function called `load_customized()` to give you a better piece of code for predicting DTI using a trained model, but generating the "template dataset" is a task of your own. Wait a while for me to update it.
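To make the "template dataset" idea concrete, here is a minimal sketch that writes a CSV in which every interaction value is 0. The column names and SMILES/target IDs below are only guesses at the `restructured.csv` layout; copy the exact header from your `davis_data/restructured.csv` instead:

```python
import pandas as pd

# Hypothetical drug-target pairs to be predicted; in practice these come
# from your own compounds and protein targets.
pairs = [
    ("CC(=O)Oc1ccccc1C(=O)O", "P00533"),
    ("CN1CCC[C@H]1c1cccnc1",  "P04626"),
]

# Column names here are assumptions about the restructured.csv layout.
template = pd.DataFrame(pairs, columns=["smiles", "proteinName"])
template["interaction_value"] = 0.0   # all-zero labels, as suggested above

template.to_csv("my_template_restructured.csv", index=False)
```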
@simonfqy Ok, I will continue to try the method you gave, thank you for your patience.
@simonfqy
I ran my `drive4_d_warm.sh` file to train on my data, then I ran several trainings with different `batch_size` and `learning_rate` values to get different models and predict my data separately, but I got an unexpected error:
Traceback (most recent call last):
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _do_call
return fn(*args)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1329, in _run_fn
status, run_metadata)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [128] rhs shape= [64]
[[Node: save/Assign_2 = Assign[T=DT_FLOAT, _class=["loc:@BatchNormalization_18/BatchNormalization_18_beta"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](BatchNormalization_18/BatchNormalization_18_beta, save/RestoreV2_2)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "driver.py", line 699, in <module>
tf.app.run(main=run_analysis, argv=[sys.argv[0]] + unparsed)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 124, in run
_sys.exit(main(argv))
File "driver.py", line 278, in run_analysis
prediction_file=csv_out)
File "/project/git2/PADME/dcCustom/molnet/run_benchmark_models.py", line 194, in model_regression
model.predict(train_dataset, transformers=transformers, csv_out=prediction_file, tasks=tasks)
File "/project/git2/PADME/dcCustom/models/tensorgraph/tensor_graph.py", line 642, in predict
self.restore()
File "/project/git2/PADME/dcCustom/models/tensorgraph/tensor_graph.py", line 1066, in restore
saver.restore(self.session, checkpoint)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1686, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1128, in _run
feed_dict_tensor, options, run_metadata)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
options, run_metadata)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [128] rhs shape= [64]
[[Node: save/Assign_2 = Assign[T=DT_FLOAT, _class=["loc:@BatchNormalization_18/BatchNormalization_18_beta"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](BatchNormalization_18/BatchNormalization_18_beta, save/RestoreV2_2)]]
Caused by op 'save/Assign_2', defined at:
File "driver.py", line 699, in <module>
tf.app.run(main=run_analysis, argv=[sys.argv[0]] + unparsed)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 124, in run
_sys.exit(main(argv))
File "driver.py", line 278, in run_analysis
prediction_file=csv_out)
File "/project/git2/PADME/dcCustom/molnet/run_benchmark_models.py", line 194, in model_regression
model.predict(train_dataset, transformers=transformers, csv_out=prediction_file, tasks=tasks)
File "/project/git2/PADME/dcCustom/models/tensorgraph/tensor_graph.py", line 642, in predict
self.restore()
File "/project/git2/PADME/dcCustom/models/tensorgraph/tensor_graph.py", line 1065, in restore
saver = tf.train.Saver(var_list=var_list)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1239, in __init__
self.build()
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1248, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1284, in _build
build_save=build_save, build_restore=build_restore)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 765, in _build_internal
restore_sequentially, reshape)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 440, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 160, in restore
self.op.get_shape().is_fully_defined())
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
validate_shape=validate_shape)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 59, in assign
use_locking=use_locking, name=name)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/home/zh/anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [128] rhs shape= [64]
[[Node: save/Assign_2 = Assign[T=DT_FLOAT, _class=["loc:@BatchNormalization_18/BatchNormalization_18_beta"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](BatchNormalization_18/BatchNormalization_18_beta, save/RestoreV2_2)]]
I used the same prediction data, which made me very upset. I couldn't solve this problem today.
@Running-z The `--predict_only` and `--restore_model` parameters will load the model that was already trained. Since you now want to train from scratch, you should remove the `--predict_only` parameter. The problem with your current run is that you are loading the model that was already trained, but you're now using a set of different hyperparameters, which would result in a different model. So the error is expected.
I will write my `load_customized()` function a bit later, but hopefully soon enough.
I suggest you read the documentation in `driver.py` about parameters like `--predict_only` more carefully; hopefully it will prevent future mistakes.
@simonfqy
OK, I was previously using the earlier source code, which did not yet have the `--restore_model` parameter. Today I followed the method you described and still got the same error. My goal is to use the trained model to predict my data, not to train. I modified `nb_epoch` and `learning_rate` in the `preset_hyper_parameters.py` file separately for each run and saved the three trained models. I want to use these three models for prediction separately, so don't I need `--predict_only`? According to your `--restore_model` parameter, if `--predict_only` is `True`, then `--restore_model` is also `True`, but I still get the same error.
Another question: since your model method comes from DeepChem, I saw that you save the model using DeepChem's save method, but there is no `model.pickle` file in the training results, so it seems you did not use the `save` method. So I wonder: when predicting on new data, is the trained model actually loaded with its weights? Could the result just be random? I trained a model yesterday and predicted new data, and the error between the predicted values and the real values is very large, just like random output.
@Running-z If you're using the same set of parameters (except `nb_epoch`, which does not matter) for training and predicting, you should be able to get the correct result. Are you sure that you used the changed hyperparameters for training? Because if you did, I don't understand why this happens.
Regarding your second comment: though I don't store `model.pickle` (I removed that line because the pickle file is not useful to me), the trained model stored in `--model_dir` contains everything about the model, including the number of neurons in each layer, all the links between neurons, and the weights of those edges. I also plotted the "new" data points using the trained model, and they make sense. (I will include the pictures in my paper.) You said the error against the real values is very large, so I think you are doing something wrong. One possibility is that, in your "template dataset", the interaction values are non-zero, and the normalization transformer calculates the mean and standard deviation of those values, when it should actually be the same transformer that was fitted during training (using the mean and standard deviation of the training dataset).
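As a rough numeric illustration of that pitfall (plain NumPy, not PADME code): if the z-scores predicted by the model are undone with the template dataset's statistics instead of the training dataset's, the recovered values land on a completely different scale:

```python
import numpy as np

train_y = np.array([5.0, 6.0, 7.0, 8.0, 9.0])     # labels the model was trained on
template_y = np.array([0.1, 0.2, 0.3, 0.4, 0.5])  # labels in a non-zero "template" file

# Suppose the trained network outputs these z-scores for the new pairs.
pred_z = np.array([0.5, -0.3, 1.2, 0.0, -1.0])

# Correct: undo the normalization with the TRAINING dataset's statistics.
correct = pred_z * train_y.std() + train_y.mean()

# Wrong: undo it with the template dataset's statistics instead.
wrong = pred_z * template_y.std() + template_y.mean()

print(correct)  # ~[7.71 6.58 8.70 7.00 5.59] -- on the training-label scale
print(wrong)    # ~[0.37 0.26 0.47 0.30 0.16] -- nowhere near the true values
```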
You can provide me with more information if you want. I will try to implement and upload the `load_customized()` function within 24 hours.
@simonfqy
Yes, I first modify the `batch_size`, `nb_epoch`, and `learning_rate` values of the `graphconvreg` model in `preset_hyper_parameters.py` and run the first training. Once that training starts, I change the `batch_size`, `nb_epoch`, and `learning_rate` values of the `graphconvreg` model in `preset_hyper_parameters.py` again and start another training. When both models are trained, I change the `model_dir` path to predict, but I get the error above.
My interaction values are non-zero. My training data looks like this:
My prediction data looks like this:
In the image below, true_pX represents the true value and pre_pX represents the predicted value. As you can see, the error is really unacceptable:
My training data and prediction data are both loaded with the Davis loader, so the processing should be exactly the same. I don't understand what you mean about the normalization transformer's mean and standard deviation, or what else I did wrong.
@Running-z I have updated the `load_nci60()` function as a temporary quick-and-dirty solution. You can refer to `drive_nci60.sh` and `drive4_nci60.sh` to see how to use it. You should also read the documentation in
https://github.com/simonfqy/PADME/blob/d2d307fe17e1229add45f0c82bd50ed12bbfae35/dcCustom/molnet/load_function/nci60_dataset.py#L25
because if you don't read the documentation and just charge ahead, you will hit bugs. In particular, you should rename your template prediction file as prescribed in the documentation.
You really need to look at the code to understand where the `transformer` comes into play. To put it simply, the `NormalizationTransformer` calculates the mean and standard deviation of the raw data and then transforms the raw data into normalized z-scores, and the system (the `load_****()` functions) then stores them in the `DiskDataset` object. So in both training and prediction you are working with z-scores: the DNN is trained on z-scores and outputs z-scores, and the `NormalizationTransformer` is then used again to transform the z-scores back to the correct values.
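Here is a minimal sketch of that flow using the DeepChem-style API that dcCustom forks; the exact calls inside PADME's `load_****()` functions and `TensorGraph.predict()` may differ, so this is an illustration under those assumptions, not the real implementation:

```python
import numpy as np
import deepchem as dc  # dcCustom mirrors this API

# Toy training data standing in for the featurized DiskDataset.
X = np.random.rand(20, 16)
y = np.random.rand(20, 1) * 10
train_dataset = dc.data.DiskDataset.from_numpy(X, y)

# Fit the transformer on the TRAINING labels and replace y with z-scores.
transformer = dc.trans.NormalizationTransformer(transform_y=True,
                                                dataset=train_dataset)
train_dataset = transformer.transform(train_dataset)

# ... a model is trained on the z-scored labels and predicts z-scores ...
pred_z = np.random.randn(5, 1)   # stand-in for the model's raw output

# The same transformer maps the z-scores back to the original label scale.
pred = dc.trans.undo_transforms(pred_z, [transformer])
```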
I assume you didn't make these mistakes, but for caution I list some of them here:
- When you're training a model and predicting with it, make sure that both times the `--model_dir` parameter is the same.
- Make sure that you store the different models in different directories, as specified in `--model_dir`.
- The hyperparameters you used could simply be performing very poorly. That might be one reason for the bad performance.
- Every time you change the data you use while still using the same `load_xxxx()` function, you must make sure that the original directory corresponding to the old `DiskDataset` object is renamed or deleted, otherwise the old dataset will be reloaded. For example, if you have used the `davis` dataset before for training and have a folder `davis_data/GraphConv/` storing the data, and you're now using your own data for training but still using the `load_davis()` function, you must either rename or remove the `davis_data/GraphConv/` directory, so that when the load function executes, the new dataset gets processed. The same goes for predicting. You must be VERY careful with this (see the small housekeeping sketch at the end of this comment).
- If anything goes wrong with the `load_nci60()` function, tweak it as you wish. Note that the `load_davis()` function it uses depends on the availability of the Davis `DiskDataset` that you already created, so the `load_nci60()` function depends on the dataset you used for training the model:
https://github.com/simonfqy/PADME/blob/d2d307fe17e1229add45f0c82bd50ed12bbfae35/dcCustom/molnet/load_function/nci60_dataset.py#L71
Will update this comment later, if I can think of other points to make.
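As referenced in the list above, this is a small sketch of the kind of housekeeping meant in the point about stale `DiskDataset` directories. The `davis_data/GraphConv/` path is taken from the example there; adjust it to wherever your featurized data is actually cached:

```python
import os
import shutil

# Directory where the previously featurized DiskDataset was cached.
cached_dir = "davis_data/GraphConv/"

if os.path.isdir(cached_dir):
    # Rename (or delete) the stale cache so the load function re-featurizes
    # the new CSV instead of silently reloading the old dataset.
    shutil.move(cached_dir, cached_dir.rstrip("/") + "_old")
```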
@simonfqy
OK, thank you for your patience. I have taken note of the issues you mentioned, but I still don't understand why the predictions are not accurate. I used your default parameters and also modified the hyperparameters according to my own ideas, but the results are almost the same: the predicted values are mostly negative, and I feel this is worse than random prediction, so it's hard to believe the hyperparameters alone are the problem. At the same time, I think the parameters of the model should also be customizable, such as the `GraphConvModel` model's `Graph_conv_layers`, etc.
In addition, your `load_nci60()` function is still useless to me, because I am still training the model. I hope you can use your new function to load data prediction after training.
@Running-z The `GraphConvModel` is already quite customizable; you can do it yourself if you want more flexibility.
You can try using `load_nci60()` with the `tf_regression` model, which takes much less time to train, and you should already have a trained model for it.
What do you mean when you wrote "I hope you can use your new function to load data prediction after training"?
@simonfqy Can `GraphConvModel` be customized? I don't seem to see model parameters that can be freely defined, such as the `Graph_conv_layers` parameter. About your last question: sorry, I misspoke. I meant that the `load_nci60()` function has not been used yet, because I am still training the model using the hyperparameter search method you provided. I hope that after the model is trained, I can use your `load_nci60()` function to load the data and predict correctly.
I don't think it can be customized. Perhaps I could look at it over the weekend.
@simonfqy Okay thank you
@simonfqy
Hello, I remember you said before that during training, because the `NormalizationTransformer` is used, the mean and standard deviation of the data are calculated. So I think the reason my prediction results are completely different from the real values is probably that the predictions are not transformed back using the `NormalizationTransformer`. Maybe the `NormalizationTransformer` in your prediction code has no effect. What do you think?
That's not right. I didn't change the code related to this functionality from DeepChem.
You should really look into the code carefully to figure out the problem. My response is that the results are transformed back in this line:
https://github.com/simonfqy/PADME/blob/39dff90592f5142233ece5a95ebd95f1ef6e5649/dcCustom/models/tensorgraph/tensor_graph.py#L543
Closing again.
I trained a model and wanted to try predictions. I used your latest code, modified the `drive_nci60.sh` file, and ran the prediction. I have a total of 11557 data points, but the final prediction only contains 9245, so the behavior you mentioned before, that prediction should not split the data, does not seem to have been achieved. This is the total number of predicted results:
This is the modified `drive_nci60.sh` file: