Closed: @maykcaldas closed this issue 1 year ago.
Hey @maykcaldas,
I am not really sure what you are trying to do with the tokenizer if you want to access the encoder output.
If you want to access the encoder output, have a look at `DECIMER/Predictor_EfficientNet2.py`. In that file, add a function called `get_encoder_output` (I have copied this from `evaluate()` and left out the bits that process the image embedding with the Transformer decoder, as we just want the image embedding here):
```python
def get_encoder_output(image_path: str):
    """
    This function takes an image path (str) and returns the encoder output.

    Args:
        image_path (str): Path of chemical structure depiction image

    Returns:
        (tf.Tensor): image embedding
    """
    sample = config.decode_image(image_path)
    _image_batch = tf.expand_dims(sample, 0)
    image_embedding = encoder(_image_batch, training=False)
    return image_embedding
```
If you then call that function, you get the EfficientNet V2 encoder output.
I hope this helps! Have a nice day! :)
Hey @OBrink, thanks for your suggestion! @smichtavy and I were working on that today.
I tried it, but I ran into a few problems:
1.1. I needed `import DECIMER.efficientnetv2 as efficientnetv2` to use the `EffNetV2Model` class. It might be an environment problem, but the `efficientnetv2` directory is not a module. I added an `__init__.py` file in `efficientnetv2` to import `EffNetV2Model`.
1.2. In the `config.py` file, we had the import `import DECIMER.Transformer_decoder`, but the code was calling `Transformer_decoder`. I changed it to call `DECIMER.Transformer_decoder`.
1.3. The `efficientnetv2/utils.py` file needs the `tensorflow_addons` package, which is missing from `requirements.txt`.

After solving these three issues, I could run the `Predictor_EfficientNet2.py` script. It was missing the `tokenizer_Isomeric_SELFIES` and `max_length_Isomeric_SELFIES` pickle files, so I used the ones provided on Zenodo: https://zenodo.org/record/8093783/files/models.zip (I needed to rename the files). But running it raised a shape error. Do you have any idea what may be causing it?
```
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP).
For more information see: https://github.com/tensorflow/addons/issues/2807
  warnings.warn(
Traceback (most recent call last):
  File "Predictor_EfficientNet2.py", line 179, in <module>
    main()
  File "Predictor_EfficientNet2.py", line 100, in main
    SMILES = predict_SMILES(sys.argv[1])
  File "Predictor_EfficientNet2.py", line 170, in predict_SMILES
    predicted_SELFIES = evaluate(image_path)
  File "Predictor_EfficientNet2.py", line 119, in evaluate
    _image_embedding = encoder(_image_batch, training=False)
  File "/home/maykcaldas/.local/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/maykcaldas/.local/lib/python3.8/site-packages/DECIMER/Efficient_Net_encoder.py", line 80, in call
    x = self.reshape(x, training=training)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Exception encountered when calling layer 'image_embedding' (type Reshape).
{{function_node __wrapped__Reshape_device_/job:localhost/replica:0/task:0/device:CPU:0}} Input to reshape is a tensor with 59392 values, but the requested shape has 23200 [Op:Reshape]
Call arguments received by layer 'image_embedding' (type Reshape):
  • inputs=tf.Tensor(shape=(1, 16, 16, 232), dtype=float32)
```
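A quick sanity check on the numbers in that error (this is an interpretation, not a confirmed diagnosis): the reshape layer received a `(1, 16, 16, 232)` tensor, but was configured for far fewer values, which is consistent with the image being decoded at a larger spatial resolution than the checkpoint expects.

```python
# Values the Reshape layer actually received: 1 * 16 * 16 * 232
received = 1 * 16 * 16 * 232
assert received == 59392  # matches the "59392 values" in the error

# Values the layer was configured for, per the error message.
# 23200 / 232 = 100, i.e. a 10x10 spatial grid with 232 channels,
# suggesting a mismatch between the decoded image resolution and
# the one the checkpoint was built for (hypothetical interpretation).
expected_values = 23200
channels = 232
spatial = expected_values // channels
print(spatial)  # 100, i.e. a 10x10 feature map
```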
Hey @maykcaldas & @smichtavy,
I am sorry for the confusion, I will look into cleaning up a couple of things in the repository.
I found an easier solution that saves us a lot of trouble, and I confirmed that it works:
1) `git clone https://github.com/Kohulan/DECIMER-Image_Transformer`
2) `cd DECIMER-Image_Transformer`
3) in `DECIMER/__init__.py`, replace the imports with:
```python
from .decimer import predict_SMILES, DECIMER_V2
from . import config
```
4) `pip install .`
5) in your Python code, call the encoder the following way:
```python
from DECIMER import DECIMER_V2, config
import tensorflow as tf

encoder = DECIMER_V2.DECIMER.encoder


def get_encoder_output(image_path: str):
    """
    This function takes an image path (str) and returns the encoder output.

    Args:
        image_path (str): Path of chemical structure depiction image

    Returns:
        (tf.Tensor): image embedding
    """
    sample = config.decode_image(image_path)
    _image_batch = tf.expand_dims(sample, 0)
    image_embedding = encoder(_image_batch, training=False)
    return image_embedding


get_encoder_output(image_path)
```
If I run this on `/Tests/caffeine.png`, I get:
```
<tf.Tensor: shape=(1, 256, 512), dtype=float32, numpy=
array([[[-0.05972348, -0.22405794,  0.26993287, ...,  0.25145775,
         -0.10838411, -0.24237064],
        [ 0.20004356, -0.53331375, -0.04955457, ...,  0.16480035,
         -0.03783005, -0.26316923],
        [ 0.57722473, -0.38815224,  0.33489236, ..., -0.04090453,
         -0.03352015, -0.32710063],
        ...,
        [ 0.83991385,  0.2703736 ,  0.2658328 , ...,  0.21896963,
         -0.20168161,  0.29251066],
        [ 0.82687134, -1.2922107 , -0.25024006, ..., -0.64587194,
         -1.0208106 ,  0.07405327],
        [ 0.44606146, -0.1196775 ,  1.3218036 , ..., -0.4189508 ,
         -1.0974574 , -0.6215218 ]]], dtype=float32)>
```
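If you need one fixed-size vector per image (e.g. for similarity search) rather than the full `(1, 256, 512)` sequence, one common option is to mean-pool over the 256 token positions. A minimal sketch, with NumPy standing in for the actual TensorFlow tensor; on the real output, `tf.reduce_mean(image_embedding, axis=1)` does the same thing:

```python
import numpy as np

# Stand-in for the (1, 256, 512) embedding returned by get_encoder_output();
# the real tensor comes from TensorFlow, this is just random data of the
# same shape for illustration.
image_embedding = np.random.default_rng(0).normal(size=(1, 256, 512))

# Mean-pool over the 256 token positions, then drop the batch axis,
# leaving a single 512-dimensional vector for the image.
pooled = image_embedding.mean(axis=1).squeeze(0)
print(pooled.shape)  # (512,)
```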
Let us know if you have any further trouble! I'll wait until I hear from you to close this issue.
Have a nice weekend! :) Otto
Thank you very much! This one is much cleaner! Appreciate that!!
Feel free to close this issue :)
Have a great weekend you too!
I've reopened the issue since we have to update the Predictor code to use checkpoints. I will close it once the issue is solved.
Issue Type
Questions
Source
PyPI
DECIMER Image Transformer Version
2.3.0
OS Platform and Distribution
MacBook Pro M1, 2020
Python version
3.10
Current Behaviour?
Hey!
Is there a way to access the encoder output using DECIMER's loaded model? I'm interested in the embedded representation that is fed to the decoder, not the SMILES itself. I was wondering if it's possible to access it, since the `Transformer` class calls the encoder and the decoder separately.

I could reproduce the `predict_SMILES` function by loading the model from the checkpoint available on Zenodo, but since it's a TF model, I can only `__call__` it.

Is there any possible way to load these weights into the `Transformer` class so I can call the `t_encoder` to access the `enc_output`? Having an argument in the call to expose the hidden states would also work fine.

Any suggestion is welcome! Thanks! Mayk
Which images caused the issue? (This is mandatory for images related issues)
No response
Standalone code to reproduce the issue
Relevant log output
No response
Code of Conduct