nturaga opened this issue 2 years ago
Hi Nitesh.
The keras package is useful for deep learning in R. However, as you know, a working Python installation is required, because the native TensorFlow and Keras libraries are written in Python and accessed from R via the reticulate package. Therefore, keras::is_keras_available() and reticulate::py_available() are used to check that they are available in the current system environment. If they are not available from R, please investigate your system environment. Thanks.
Dongmin
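As a quick reference, a minimal way to inspect which Python reticulate has picked up (a sketch only; the interpreter path passed to use_python() is a placeholder):

# Which Python is reticulate bound to, and can the Keras module be imported?
reticulate::py_config()          # discovered interpreter, version, and module paths
reticulate::py_available()       # TRUE once a Python interpreter has been initialized
keras::is_keras_available()      # TRUE if the Keras Python module can be imported
# If the wrong interpreter is picked up, point reticulate at the right one
# before any Python code runs (placeholder path):
# reticulate::use_python("/usr/bin/python3.8", required = TRUE)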
They are available on my machine. The fit_cpi function fails because of an internal issue within the function itself.
Run within the R console:
Note: the if condition passes, and it fails at the fitting model... step.
> if (keras::is_keras_available() & reticulate::py_available()) {
    compound_max_atoms <- 50
    protein_embedding_dim <- 16
    protein_length_seq <- 100
    gcn_cnn_cpi <- fit_cpi(
        smiles = example_cpi[train_idx, 1],
        AAseq = example_cpi[train_idx, 2],
        outcome = example_cpi[train_idx, 3],
        compound_type = "graph",
        compound_max_atoms = compound_max_atoms,
        protein_length_seq = protein_length_seq,
        protein_embedding_dim = protein_embedding_dim,
        protein_ngram_max = 2,
        protein_ngram_min = 1,
        smiles_val = example_cpi[!train_idx, 1],
        AAseq_val = example_cpi[!train_idx, 2],
        outcome_val = example_cpi[!train_idx, 3],
        net_args = net_args,
        epochs = 20,
        batch_size = 64,
        callbacks = keras::callback_early_stopping(
            monitor = "val_accuracy",
            patience = 10,
            restore_best_weights = TRUE))
    ttgsea::plot_model(gcn_cnn_cpi$model)
  }
checking sequences...
preprocessing for compounds...
preprocessing for proteins...
fitting model...
Error in py_call_impl(callable, dots$args, dots$keywords): RuntimeError: Evaluation error: AttributeError: __module__
Detailed traceback:
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py", line 530, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/keras/engine/base_layer.py", line 315, in __init__
    self._instrument_layer_creation()
  File "/opt/conda/lib/python3.7/site-packages/keras/engine/base_layer.py", line 300, in _instrument_layer_creation
    keras_layers_gauge.get_cell(self._get_cell_name()).set(True)
  File "/opt/conda/lib/python3.7/site-packages/keras/engine/base_layer.py", line 287, in _get_cell_name
    return self.__class__.__module__ + '.' + self.__class__.__name__
.
Detailed traceback:
  File "/home/jupyter/packages/reticulate/python/rpytools/call.py", line 21, in python_function
    raise RuntimeError(res[kErrorKey])
Traceback:
1. fit_cpi(smiles = example_cpi[train_idx, 1], AAseq = example_cpi[train_idx,
. 2], outcome = example_cpi[train_idx, 3], compound_type = "graph",
. compound_max_atoms = compound_max_atoms, protein_length_seq = protein_length_seq,
. protein_embedding_dim = protein_embedding_dim, protein_ngram_max = 2,
. protein_ngram_min = 1, smiles_val = example_cpi[!train_idx,
. 1], AAseq_val = example_cpi[!train_idx, 2], outcome_val = example_cpi[!train_idx,
. 3], net_args = net_args, epochs = 20, batch_size = 64,
. callbacks = keras::callback_early_stopping(monitor = "val_accuracy",
. patience = 10, restore_best_weights = TRUE)) # at line 5-24 of file <text>
2. do.call(net_args$compound, net_args$compound_args)
3. gcn_in_out(gcn_units = c(128, 64), gcn_activation = c("relu",
. "relu"), fc_units = 10, fc_activation = "relu", max_atoms = 50,
. feature_dim = 24L)
4. x %>% layer_multi_linear(units = temp_units) %>% keras::layer_activation(activation = gcn_activation[i])
5. keras::layer_activation(., activation = gcn_activation[i])
6. create_layer(keras$layers$Activation, object, list(activation = activation,
. input_shape = normalize_shape(input_shape), batch_input_shape = normalize_shape(batch_input_shape),
. batch_size = as_nullable_integer(batch_size), dtype = dtype,
. name = name, trainable = trainable, weights = weights))
7. layer_multi_linear(., units = temp_units)
8. create_layer(layer, object, .args)
9. do.call(layer_class, args)
10. (structure(function (...)
. {
. dots <- py_resolve_dots(list(...))
. result <- py_call_impl(callable, dots$args, dots$keywords)
. if (convert)
. result <- py_to_r(result)
. if (is.null(result))
. invisible(result)
. else result
. }, class = c("python.builtin.type", "python.builtin.object"), py_object = <environment>))(units = temp_units)
11. py_call_impl(callable, dots$args, dots$keywords)
It works on my Windows machine. The version information of the Python packages is as follows. For fit_cpi, I didn't run the latest versions of those Python packages, but I recommend Python 3.8 or higher. These are the versions I'm using:
keras==2.8.0
tensorflow==2.8.0
Python 3.9.9
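For what it's worth, these versions can also be confirmed from within the R session (a minimal sketch; assumes the reticulate and tensorflow R packages are installed):

# Report the Python and backend versions that R is actually bound to
reticulate::py_config()$version     # Python version in use
tensorflow::tf_version()            # TensorFlow version visible from R
keras_mod <- reticulate::import("keras")
keras_mod$`__version__`             # Keras version visible from R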
But where do you specify the versions of these packages, in your DESCRIPTION file or elsewhere in your R package? Also, how is a user supposed to know which specific versions to use?
I'm inexperienced in this domain, and I haven't used reticulate much. Python packages generally ship with a requirements.txt file. Maybe it's worth adding a requirements.txt under the inst/extdata directory.
Also, can you confirm that this package is not compatible with the versions I mentioned, and check whether it's an issue within Python? I can do the opposite and see if your fit_cpi function works with the package versions you mentioned.
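To make the requirements.txt idea concrete, one possible pattern (a sketch only; DeepPINCS does not currently ship such a file, and the extdata path is hypothetical) is to read a pinned requirements file from inst/extdata and install it through reticulate:

# Hypothetical: install pinned Python dependencies shipped with the package
reqs_file <- system.file("extdata", "requirements.txt", package = "DeepPINCS")
reqs <- readLines(reqs_file)                          # e.g. "tensorflow==2.8.0"
reqs <- reqs[nzchar(reqs) & !startsWith(reqs, "#")]   # drop blank and comment lines
reticulate::py_install(reqs, pip = TRUE)              # pip accepts "pkg==version" specs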
Hi @dongminjung,
After testing your package a little more, it seems the requirements for running it are very stringent: it works only with Python 3.8 and the versions of Keras and TensorFlow you mentioned.
I've created a test docker image based on bioconductor/bioconductor_docker:devel to make sure it's all reproducible.
The only issue I've had is that I'm unable to run it on Python 3.9 with the latest versions of Keras and TensorFlow. Please mention the specific versions in your package README. Do you have plans to make your package work with the latest versions of Python, TensorFlow, and Keras?
The docker image is nitesh1989/bioconductor_docker:deeppincs_RELEASE_3_15.
$ python3 --version
Python 3.8.10
$ cat requirements.txt
absl-py==0.15.0
astunparse==1.6.3
cachetools==5.0.0
certifi==2021.10.8
charset-normalizer==2.0.12
flatbuffers==1.12
gast==0.3.3
google-auth==2.6.0
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
grpcio==1.32.0
h5py==2.10.0
idna==3.3
importlib-metadata==4.11.1
joblib==1.1.0
Keras==2.4.3
Keras-Preprocessing==1.1.2
Markdown==3.3.6
numpy==1.19.5
oauthlib==3.2.0
opt-einsum==3.3.0
pandas==1.4.0
protobuf==3.19.4
pyasn1==0.4.8
pyasn1-modules==0.2.8
python-dateutil==2.8.2
pytz==2021.3
PyYAML==6.0
requests==2.27.1
requests-oauthlib==1.3.1
rsa==4.8
scikit-learn==1.0.2
scipy==1.8.0
six==1.15.0
sklearn==0.0
tensorboard==2.8.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow==2.4.0
tensorflow-estimator==2.4.0
termcolor==1.1.0
threadpoolctl==3.1.0
typing-extensions==3.7.4.3
urllib3==1.26.8
Werkzeug==2.0.3
wrapt==1.12.1
zipp==3.7.0
Hi Nitesh.
Ok. I'll add the versions of the packages I used to the README file. To keep up with changes, I'll also update DeepPINCS to the latest versions of TensorFlow, Keras, and Python. I'll let you know when it's done. Thank you for your help.
Hi @nturaga
I updated the development version of DeepPINCS (1.3.8) to work with the latest versions of TensorFlow, Keras, and Python. The versions of these packages are now mentioned in the README file. Please check them. Thanks.
Hi Dongmin,
I’m part of the Bioconductor Core team, and we’ve been taking a look at your package.
Your package has an issue where the vignette is not fully run. In your vignette, the DeepPINCS::fit_cpi() function fails, leading to the error shown in this gist (https://gist.github.com/nturaga/311c05d989208e4a363d661c9c18e29c). This happens on both CPUs and GPUs. The only reason your package passes on the Bioconductor build system is that the 'if' statement at the beginning of the model fit evaluates to 'FALSE'.
This guard shouldn't be relied on if you actually want to run the vignette. Please fix the error shown in the gist.
Best,
Nitesh
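For context, the guard mentioned above silently disables the model-fitting chunk whenever Keras is unavailable, which is why the build passes. A minimal sketch of a vignette setup chunk that makes the skip visible instead of silent (assuming a standard knitr-based vignette):

# Decide once whether the Python backend is usable, and surface that decision
run_full <- keras::is_keras_available() && reticulate::py_available(initialize = TRUE)
knitr::opts_chunk$set(eval = run_full)
if (!run_full) {
    message("Keras/TensorFlow not available: model-fitting chunks are not evaluated.")
}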