SpacyTokenizer load time is too high

Arjunsankarlal commented 4 years ago

Rasa version: 1.6.0 (latest)

Rasa SDK version (if used & relevant): Not relevant

Rasa X version (if used & relevant): Not relevant

Python version: 3.7.1

Operating system (windows, osx, ...): MacOS

Issue: I am using a custom spaCy pipeline with the following components:

- name: SpacyTokenizer
- name: SpacyFeaturizer
- name: EmbeddingIntentClassifier

After debugging, I found that the load times for the individual components are:

Time taken to load component 0 is 22.385185628000002
Time taken to load component 1 is 0.00010501800000284334
Time taken to load component 2 is 5.866900000128794e-05

Error (including full traceback):

WARNING:tensorflow:From /Users/arjun-zt235/environments/rasa/lib/python3.7/site-packages/tensor2tensor/utils/expert_utils.py:68: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From /Users/arjun-zt235/environments/rasa/lib/python3.7/site-packages/tensor2tensor/utils/adafactor.py:27: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

WARNING:tensorflow:From /Users/arjun-zt235/environments/rasa/lib/python3.7/site-packages/tensor2tensor/utils/multistep_optimizer.py:32: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

WARNING:tensorflow:From /Users/arjun-zt235/environments/rasa/lib/python3.7/site-packages/tensor2tensor/models/research/glow_init_hook.py:25: The name tf.train.SessionRunHook is deprecated. Please use tf.estimator.SessionRunHook instead.

WARNING:tensorflow:From /Users/arjun-zt235/environments/rasa/lib/python3.7/site-packages/tensor2tensor/models/research/neural_stack.py:38: The name tf.nn.rnn_cell.RNNCell is deprecated. Please use tf.compat.v1.nn.rnn_cell.RNNCell instead.

WARNING:tensorflow:From /Users/arjun-zt235/environments/rasa/lib/python3.7/site-packages/tensor2tensor/rl/gym_utils.py:235: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.

WARNING:tensorflow:From /Users/arjun-zt235/environments/rasa/lib/python3.7/site-packages/tensor2tensor/utils/trainer_lib.py:111: The name tf.OptimizerOptions is deprecated. Please use tf.compat.v1.OptimizerOptions instead.

WARNING:tensorflow:From /Users/arjun-zt235/environments/rasa/lib/python3.7/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

Time taken to load component 0 is 22.385185628000002
Time taken to load component 1 is 0.00010501800000284334
Time taken to load component 2 is 5.866900000128794e-05
2019-12-20 18:30:51.333993: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-12-20 18:30:51.353678: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7ff40acf8190 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2019-12-20 18:30:51.353692: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:From /Users/arjun-zt235/environments/rasa/lib/python3.7/site-packages/rasa/utils/train_utils.py:961: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.

Time taken to load component 3 is 0.3157015059999999
{'intent': {'name': '2', 'confidence': 0.5746800899505615}, 'entities': [], 'intent_ranking': [{'name': '2', 'confidence': 0.5746800899505615}, {'name': '3', 'confidence': 0.10587094724178314}, {'name': '8', 'confidence': 0.0751015767455101}, {'name': '6', 'confidence': 0.06461881101131439}, {'name': '5', 'confidence': 0.05602974072098732}, {'name': '0', 'confidence': 0.04509160667657852}, {'name': '9', 'confidence': 0.0442366860806942}, {'name': '1', 'confidence': 0.024302536621689796}, {'name': '7', 'confidence': 0.005946047138422728}, {'name': '4', 'confidence': 0.004121904727071524}], 'text': 'Where are you ?'}

Command or request that led to error:

from rasa.nlu.model import Interpreter

interpreter = Interpreter.load('../bots/rasa/trainer_model_persist1')
print(interpreter.parse('Where are you ?'))

Content of configuration file (config.yml) (if relevant): Loaded from dict

{'language': 'en',
            'pipeline': [
                {'name': 'SpacyTokenizer'},
                {'name': 'SpacyFeaturizer'},
                {'name': 'EmbeddingIntentClassifier'}
                ],
            'data': None,
            'policies': [
                {'name': 'MemoizationPolicy'},
                {'name': 'KerasPolicy'},
                {'name': 'MappingPolicy'}
                ]
            }

Content of domain file (domain.yml) (if relevant): Not relevant

For more details please look into this forum post

Is this expected? Or am I doing something wrong? By the way, thanks for the great repo! ❤️

Arjunsankarlal commented 4 years ago

I tried training and loading with the same data and code listed above on Python 3.6, and now it loads in 6 seconds! 😍 It would still be nice if I could reduce the load time further. If there is some way to do this, let me know! Thanks

sara-tagger commented 4 years ago

Thanks for raising this issue, @dakshvar22 will get back to you about it soon✨

Please also check out the docs and the forum in case your issue was raised there too 🤗

Arjunsankarlal commented 4 years ago

Hey @sara-tagger, thanks for the quick response. While debugging I found that this is not specific to SpacyTokenizer; it happens with every other configuration as well.

I have trained a Rasa NLU model with the following config:

language: en
pipeline:
- name: "pretrained_embeddings_convert"

This configuration expands to the following list of components:

language: "en"

pipeline:
- name: "SpacyNLP"
- name: "SpacyTokenizer"
- name: "SpacyFeaturizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "SklearnIntentClassifier"

I have also tried the other ready-made configs, such as supervised_embeddings and pretrained_embeddings_spacy, as well as custom configs. All of them take 6~9 seconds to instantiate the Trainer object. Similarly, when I tried to load the persisted model for inference,

interpreter = Interpreter.load('../path_to_trained_model')

it again takes roughly the same 6~9 seconds to load. I then measured the load time of each component: only the first component took ~6-9 seconds, while the rest took just microseconds. Next I loaded a component that is normally fast on its own with ComponentBuilder, and surprisingly it also took ~6-9 seconds, so my guess is that something else is causing this delay.

Is there any way this can be mitigated? Or am I doing something wrong? I want to serve these models on demand, which requires a faster load time.
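The per-component measurement described above can be reproduced with a small timing helper. This is a minimal sketch, not the code used in this thread: `time_steps` is a hypothetical name, and the two steps at the bottom are stand-ins for whatever imports and component loads are being measured. Timing the imports as their own step would show whether a one-time startup cost is being charged to whichever component happens to load first.

```python
import time
from typing import Callable, Dict, List, Tuple


def time_steps(steps: List[Tuple[str, Callable[[], object]]]) -> Dict[str, float]:
    """Run each named step once, in order, and return its wall-clock time in seconds."""
    durations: Dict[str, float] = {}
    for name, step in steps:
        start = time.perf_counter()
        step()
        durations[name] = time.perf_counter() - start
    return durations


# Stand-in steps (assumptions): replace them with the real imports and loads.
report = time_steps([
    ("imports", lambda: __import__("decimal")),
    ("load component", lambda: time.sleep(0.01)),
])
for name, seconds in report.items():
    print(f"{name}: {seconds:.4f}s")
```

Because the steps run in order within one process, anything the first step initialises (imported modules, global state) is already in place for the later ones, so swapping the order is a quick way to tell a slow component from a slow first step.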

dakshvar22 commented 4 years ago

@Arjunsankarlal SpacyNLP loads the language model provided by spaCy for your specified language. I am guessing that accounts for most of the load time of the first component. For pretrained_embeddings_convert, ConveRTFeaturizer would take some time because it loads a large TensorFlow-based model.

When you say you want to serve these models on demand, do you mean training them or making inferences through them?

Arjunsankarlal commented 4 years ago

Hey @dakshvar22, I suspected the same. I am loading the spaCy en model here. When I load the same model with spacy.load('en') for normal use, it takes ~0.6 s, but the same model loaded through Rasa takes up to 6~9 seconds. Any pointers for reducing this time would be helpful.

Also, when I say I want to serve these models on demand, I mean making inferences through them, but loading each model only when its first request is made. When I tried loading the saved interpreter, the load time was ~6 seconds, which I feel is very high since I will be loading models dynamically.

I also tried a different, simpler pipeline:
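For the on-demand serving scenario described above, one common pattern is to load each model on its first request and keep it in memory afterwards, so the load cost is paid once per model rather than once per request. The sketch below is illustrative only: `LazyModelCache` is a hypothetical class, and the loader is a stand-in; in the setup from this thread it would presumably be something like `Interpreter.load`.

```python
import threading
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class LazyModelCache(Generic[T]):
    """Load each model at most once, on first request, and reuse it afterwards."""

    def __init__(self, loader: Callable[[str], T]) -> None:
        self._loader = loader
        self._models: Dict[str, T] = {}
        # One lock for the whole cache: simple, but concurrent loads are serialised.
        self._lock = threading.Lock()

    def get(self, model_path: str) -> T:
        with self._lock:
            if model_path not in self._models:
                self._models[model_path] = self._loader(model_path)
            return self._models[model_path]


# Stand-in loader (assumption): returns a dict instead of a real model.
cache = LazyModelCache(lambda path: {"loaded_from": path})
model = cache.get("path_to_trained_model")        # loads on first use
same_model = cache.get("path_to_trained_model")   # served from memory
print(model is same_model)
```

This does not shorten the first load; it only avoids repeating it. If the first request must also be fast, the same cache can be filled ahead of time by calling `get` for the expected models at process startup.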

from rasa.nlu import config
from rasa.nlu.model import Trainer

def get_train_config():
    return {'language': 'en',
            'pipeline': "supervised_embeddings"}

# `data` is the training data, loaded elsewhere
myconfig = config._load_from_dict(get_train_config())
trainer = Trainer(myconfig, skip_validation=True)
interpreter = trainer.train(data)

In this case the config load time is ~8.12 seconds, whereas it was just microseconds when I was using the spaCy pipeline; I am not sure what explains this difference :/ Here, though, the other components load very quickly.

One more interesting finding: after training completed, I loaded the persisted model from the saved path in the same execution, and the SpacyNLP component load time was just ~0.6 seconds. So this is not a SpacyNLP-specific issue. Is this happening because of Rasa startup?

I would love to use this NLU feature, but this latency issue is a big hindrance. Let me know your thoughts.
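One possible explanation for the observation above, not confirmed anywhere in this thread, is that heavy modules are imported only once per process: the first load pays the import cost, and later loads in the same execution find the modules already cached in `sys.modules`. The sketch below shows how that could be checked. `timed_import` is a hypothetical helper, and `xml.dom.minidom` is only a standard-library stand-in for the heavy dependency under suspicion.

```python
import importlib
import sys
import time


def timed_import(module_name: str) -> float:
    """Import a module and return how long the import call took, in seconds."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# Stand-in module name (assumption): substitute the dependency being investigated.
name = "xml.dom.minidom"
was_cached = name in sys.modules
first = timed_import(name)
second = timed_import(name)  # served from sys.modules, so no module code runs again
print(f"already imported: {was_cached}, first: {first:.6f}s, second: {second:.6f}s")
```

If the first call accounts for most of the delay, importing the heavy modules once at process startup, before any request arrives, would move that cost out of the first model load.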

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 4 years ago

This issue has been automatically closed due to inactivity. Please create a new issue if you need more help.

RasaHQ / rasa

SpacyTokenizer load time is too high #5007
