MartinoMensio / spacy-universal-sentence-encoder

Google USE (Universal Sentence Encoder) for spaCy
MIT License

How can I apply tf.function decorator to this interface? #16

Closed TheoSeo93 closed 3 years ago

TheoSeo93 commented 3 years ago

Hi! Thank you for creating this interface. It works pretty fast for me. While running text similarity with the "en_use_cmlm_md" model, I found a log message about tf.function retracing. Compared to the original models, the new (cmlm) models are much slower, so I tried to look for ways to improve the speed. I would assume tf.function could boost performance, given that I restrict the shape and datatype of the Tensors.

triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.

Is there any way I can increase the performance of computing semantic similarity by using tf.function on this interface? It seems your interface would work much faster than what I implemented with TensorFlow Hub directly, if I could solve the problem behind the warning log above. Cheers!

MartinoMensio commented 3 years ago

Hi @TheoSeo93! Thank you for opening this issue. I also noticed that warning previously. I tried to play with the function inside this library that calls the preprocessor and embeds the documents. The issue is probably around this piece of code: https://github.com/MartinoMensio/spacy-universal-sentence-encoder/blob/master/spacy_universal_sentence_encoder/language.py#L211 I have tried to decorate this function with @tf.function and also with @tf.function(experimental_relax_shapes=True), but after a few calls I always receive the warning.
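
What I tried looks roughly like this (a simplified sketch, not the exact function in language.py):

import tensorflow as tf

# Simplified sketch of the decoration I tried on the per-document embedding call.
# Even with experimental_relax_shapes=True the retracing warning comes back
# after a few calls, probably because the texts are passed as plain Python
# strings instead of tensors.
@tf.function(experimental_relax_shapes=True)
def embed(model, texts):
    return model(texts)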

Have you tried to use the model directly, as described in https://tfhub.dev/google/universal-sentence-encoder-cmlm/en-large/1 ? At the moment my interface does not allow processing a batch of texts at the same time, and I call the encoder with one text at a time. Maybe you will get a big performance increase if you use the encoder directly and submit your documents in batches:

import tensorflow_hub as hub
import tensorflow as tf
import tensorflow_text  # needed to register the ops used by the preprocess model

english_sentences = tf.constant(["dog", "Puppies are nice.", "I enjoy taking long walks along the beach with my dog."])

preprocessor = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
encoder = hub.KerasLayer(
    "https://tfhub.dev/google/universal-sentence-encoder-cmlm/en-large/1")

# here all the sentences are sent together as a batch
# (and maybe this piece also needs to be wrapped in a @tf.function ???)
english_embeds = encoder(preprocessor(english_sentences))["default"]

print(english_embeds)

I am sorry but I didn't manage to solve this yet.

Best, Martino

TheoSeo93 commented 3 years ago

Thank you for such a kind response :) I tried using the tf.function decorator on your module and did a comparison over 10 sentences from the Brown corpus. For some reason, the warning log does not show up in the Windows environment even before using the decorator, but I did see some speed improvement after editing some of your code.

Here is what I ran to measure time.


import spacy_universal_sentence_encoder
from nltk.corpus import brown  # requires the Brown corpus: nltk.download('brown')
import time

encoder = spacy_universal_sentence_encoder.load_model('en_use_cmlm_md')
sentences = brown.sents()[:10]
sen_embed = encoder("I have this")

start = time.time()
for candidate in sentences:
    sent = " ".join(candidate)
    score = sen_embed.similarity(encoder(sent))
    print(score)

print("time :", time.time() - start) 

It took 5.2730 seconds before I added the tf.function decorator, and about 4.676284 seconds after editing it (no drastic change, though), so I'm not 100% sure, but I guess it works on your code. To see whether the warning log goes away, I would have to try it in an Ubuntu environment.

    # Specify input_signature in tf.function to limit tracing - I followed the TensorFlow documentation from here:
    # https://www.tensorflow.org/guide/function#controlling_retracing
    @tf.function(input_signature=(tf.TensorSpec(shape=[None], dtype=tf.string),))
    def embed(self, texts):
        """Embed multiple texts"""
        # print('embed TFHubWrapper called')
        if hasattr(self.model, 'preprocessor'):
            result = self.model(self.model.preprocessor(texts))['default']
        else:
            result = self.model(texts)
        # result = np.array(result)   # Not sure how to convert Tensor to numpy array here. Would this be a bad idea?
        return result

    # extension implementation
    def embed_one(self, span):
        text = span.text
        # print('enable_cache', TFHubWrapper.enable_cache)
        if self.enable_cache and text in self.embed_cache:
            return self.embed_cache[text]
        else:
            result = self.embed(tf.constant([text]))[0]  # converted the sentence into a tf.constant
            if self.enable_cache:
                self.embed_cache[text] = result
            return result

Thank you for suggesting the idea of putting the sentences together as a batch. I tried that with the same tf.function decorator as above, the warning log went away, and it became faster; a rough sketch of what I tried is below. I found your code very clean and there's a lot for me to learn from it. I appreciate it. Have a great weekend!
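
The batched version looks roughly like this (a sketch reusing the preprocessor/encoder from your earlier snippet; the exact script I ran may differ slightly):

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # needed to register the ops used by the preprocess model

preprocessor = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
encoder = hub.KerasLayer(
    "https://tfhub.dev/google/universal-sentence-encoder-cmlm/en-large/1")

# input_signature keeps a single trace for any batch of strings,
# so the retracing warning does not show up
@tf.function(input_signature=(tf.TensorSpec(shape=[None], dtype=tf.string),))
def embed_batch(texts):
    return encoder(preprocessor(texts))["default"]

sentences = tf.constant(["dog", "Puppies are nice.",
                         "I enjoy taking long walks along the beach with my dog."])
embeddings = embed_batch(sentences)
print(embeddings.shape)  # one embedding vector per sentence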

MartinoMensio commented 3 years ago

Hi @TheoSeo93, Thank you very much for your useful information. I measured the time difference between the previous version and your suggested version on my machine (macOS) and I confirm the following:

- Before your modifications: 4.73 sec
- With your modification (c3d6ffc): 3.94 sec
- With .numpy() called after the embed function (bbc6efe): 3.25 sec (I guess that spaCy finds it easier to work with numpy types instead of tensors wrapping numpy)
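
The .numpy() variant is roughly the following (a sketch; the exact change in bbc6efe may differ): the tensor returned by the graph-compiled embed is converted to a numpy array before caching and returning, so spaCy only sees plain numpy vectors.

    # extension implementation
    def embed_one(self, span):
        text = span.text
        if self.enable_cache and text in self.embed_cache:
            return self.embed_cache[text]
        else:
            # .numpy() is called outside the tf.function, so embed() itself
            # still returns a Tensor from the single traced graph
            result = self.embed(tf.constant([text]))[0].numpy()
            if self.enable_cache:
                self.embed_cache[text] = result
            return result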

Altogether it is a huge improvement. Thank you for investigating and solving this.

Best, Martino