core training failed with data converted from rasa-demo

XiaoLiuAI commented 3 years ago

Rasa version: 2.1.0

Rasa SDK version (if used & relevant):

Rasa X version (if used & relevant):

Python version: 3.7

Operating system (windows, osx, ...): osx

Issue: core training failed

The training data was converted from the rasa-demo project.

Error (including full traceback):

2020-11-20 20:38:46 INFO     rasa.nlu.model  - Starting to train component DIETClassifier
/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/nlu/classifiers/diet_classifier.py:656: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  model_data.add_features(key, sub_key, [np.array(_features)])
Epochs:   0%|                                                              | 0/200 [00:00<?, ?it/s]
/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/utils/tensorflow/model_data.py:587: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  final_data[key][sub_key].append(np.concatenate(np.array(f)))
Epochs:  75%|█████████▊   | 150/200 [28:41<13:49, 16.58s/it, t_loss=2.241, i_acc=0.989, e_f1=0.990]
Epochs: 100%|█████████████| 200/200 [39:55<00:00, 11.98s/it, t_loss=2.003, i_acc=0.995, e_f1=0.993]
2020-11-20 21:18:58 INFO     rasa.utils.tensorflow.models  - Finished training.
2020-11-20 21:18:59 INFO     rasa.nlu.model  - Finished training component.
2020-11-20 21:18:59 INFO     rasa.nlu.model  - Starting to train component DucklingEntityExtractor
2020-11-20 21:18:59 INFO     rasa.nlu.model  - Finished training component.
2020-11-20 21:18:59 INFO     rasa.nlu.model  - Starting to train component EntitySynonymMapper
2020-11-20 21:18:59 INFO     rasa.nlu.model  - Finished training component.
2020-11-20 21:18:59 INFO     rasa.nlu.model  - Starting to train component ResponseSelector
Epochs: 100%|█████████████████████████| 300/300 [01:17<00:00,  3.86it/s, t_loss=1.950, r_acc=1.000]
2020-11-20 21:20:33 INFO     rasa.utils.tensorflow.models  - Finished training.
2020-11-20 21:20:33 INFO     rasa.nlu.model  - Finished training component.
2020-11-20 21:20:33 INFO     rasa.nlu.model  - Starting to train component ResponseSelector
Epochs: 100%|█████████████████████████| 300/300 [01:59<00:00,  2.50it/s, t_loss=3.030, r_acc=0.988]
2020-11-20 21:22:49 INFO     rasa.utils.tensorflow.models  - Finished training.
2020-11-20 21:22:49 INFO     rasa.nlu.model  - Finished training component.
2020-11-20 21:22:49 INFO     rasa.nlu.model  - Starting to train component ResponseSelector
Epochs: 100%|█████████████████████████| 300/300 [02:51<00:00,  1.75it/s, t_loss=2.125, r_acc=0.998]
2020-11-20 21:26:04 INFO     rasa.utils.tensorflow.models  - Finished training.
2020-11-20 21:26:04 INFO     rasa.nlu.model  - Finished training component.
2020-11-20 21:26:04 INFO     rasa.nlu.model  - Starting to train component FallbackClassifier
2020-11-20 21:26:04 INFO     rasa.nlu.model  - Finished training component.
2020-11-20 21:26:06 INFO     rasa.nlu.model  - Successfully saved model into '/var/folders/h4/l_22gmyj3vngmwgsflbqm9q80000gn/T/tmpqejmf9gn/nlu'
NLU model training completed.
Some layers from the model checkpoint at bert-base-uncased were not used when initializing TFBertModel: ['nsp___cls', 'mlm___cls']
- This IS expected if you are initializing TFBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing TFBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFBertModel were initialized from the model checkpoint at bert-base-uncased.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertModel for predictions without further training.
2020-11-20 21:26:14 INFO     rasa.nlu.components  - Added 'HFTransformersNLP' to component cache. Key 'HFTransformersNLP-bert-1caf5def35c2a17b054c4bd0aff25116'.
2020-11-20 21:26:14 INFO     rasa.nlu.components  - Added 'LanguageModelFeaturizer' to component cache. Key 'LanguageModelFeaturizer-bert-99914b932bd37a50b983c5e7c90ae93b'.
Training Core model...
/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/core/policies/form_policy.py:49: FutureWarning: 'FormPolicy' is deprecated and will be removed in in the future. It is recommended to use the 'RulePolicy' instead. (will be removed in 3.0.0)
  docs=DOCS_URL_MIGRATION_GUIDE,
/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/shared/utils/io.py:93: UserWarning: It is not recommended to use the 'RulePolicy' with other policies which implement rule-like behavior. It is highly recommended to migrate all deprecated policies to use the 'RulePolicy'. Note that the 'RulePolicy' will supersede the predictions of the deprecated policies if the confidence levels of the predictions are equal.
  More info at https://rasa.com/docs/rasa/migration-guide
Processed story blocks: 100%|█████████████████████| 525/525 [00:00<00:00, 708.86it/s, # trackers=1]
Processed story blocks: 100%|█████████████████████| 525/525 [00:34<00:00, 15.41it/s, # trackers=50]
Processed story blocks: 100%|█████████████████████| 525/525 [00:35<00:00, 14.60it/s, # trackers=50]
Processed story blocks: 100%|█████████████████████| 525/525 [00:38<00:00, 13.55it/s, # trackers=50]
Processed rules: 100%|███████████████████████████████| 1/1 [00:00<00:00, 1463.98it/s, # trackers=1]
Processed trackers: 100%|███████████████████████| 939/939 [00:03<00:00, 284.10it/s, # actions=7742]
/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/utils/tensorflow/model_data_utils.py:197: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  attribute_features = {MASK: [np.array(attribute_masks)]}
Traceback (most recent call last):
  File "/Users/xiaoliu/miniconda3/envs/work-env/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/__main__.py", line 116, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/cli/train.py", line 90, in train
    nlu_additional_arguments=extract_nlu_additional_arguments(args),
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/train.py", line 55, in train
    loop,
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/utils/common.py", line 308, in run_in_loop
    result = loop.run_until_complete(f)
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/train.py", line 110, in train_async
    nlu_additional_arguments=nlu_additional_arguments,
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/train.py", line 207, in _train_async_internal
    old_model_zip_path=old_model,
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/train.py", line 263, in _do_training
    or _interpreter_from_previous_model(old_model_zip_path),
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/train.py", line 409, in _train_core_with_validated_data
    interpreter=interpreter,
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/core/train.py", line 67, in train
    agent.train(training_data, **additional_arguments)
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/core/agent.py", line 721, in train
    training_trackers, self.domain, interpreter=self.interpreter, **kwargs
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/core/policies/ensemble.py", line 190, in train
    trackers_to_train, domain, interpreter=interpreter, **kwargs
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/core/policies/ted_policy.py", line 366, in train
    batch_strategy=self.config[BATCH_STRATEGY],
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/utils/tensorflow/models.py", line 184, in fit
    ) = self._get_tf_train_functions(eager, model_data, batch_strategy)
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/utils/tensorflow/models.py", line 426, in _get_tf_train_functions
    train_dataset_function, self.train_on_batch, eager, "train"
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/rasa/utils/tensorflow/models.py", line 408, in _get_tf_call_model_function
    tf_call_model_function(next(iter(init_dataset)))
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 840, in _call
    return self._stateless_fn(*args, **kwds)
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2829, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
    cancellation_manager=cancellation_manager)
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 550, in call
    ctx=ctx)
  File "/Users/xiaoliu/miniconda3/envs/work-env/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  reshape cannot infer the missing input size for an empty tensor unless all specified input sizes are non-zero
     [[node active_loop_sentence/SparseReshape (defined at /lib/python3.7/site-packages/rasa/utils/tensorflow/layers.py:132) ]] [Op:__inference_train_on_batch_433869]

Errors may have originated from an input operation.
Input Source operations connected to node active_loop_sentence/SparseReshape:
 batch_in_6 (defined at /lib/python3.7/site-packages/rasa/utils/tensorflow/models.py:408)
 SparseTensor_1/dense_shape (defined at /lib/python3.7/site-packages/rasa/utils/tensorflow/models.py:532)

Function call stack:
train_on_batch

Command or request that led to error:

rasa train

Content of configuration file (config.yml) (if relevant):

language: en
pipeline:
# - name: ConveRTTokenizer
# - name: ConveRTFeaturizer
- name: HFTransformersNLP
  # Name of the language model to use
  model_name: "bert"
  # Pre-Trained weights to be loaded
  # model_weights: "hfl/chinese-roberta-wwm-ext"
  model_weights: "bert-base-uncased"

- name: LanguageModelTokenizer
  intent_tokenization_flag: true
  intent_split_symboll: "-"
- name: LanguageModelFeaturizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
  OOV_token: oov
  token_pattern: (?u)\b\w+\b
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  epochs: 200
  ranking_length: 5
- name: DucklingHTTPExtractor
  url: http://localhost:8000
  dimensions:
  - email
  - number
  - amount-of-money
- name: EntitySynonymMapper
- name: ResponseSelector
  retrieval_intent: out_of_scope
  scale_loss: false
- name: ResponseSelector
  retrieval_intent: faq
  scale_loss: false
- name: ResponseSelector
  retrieval_intent: chitchat
  scale_loss: false
- name: FallbackClassifier
  threshold: 0.8
  ambiguity_threshold: 0.1
policies:
- name: TEDPolicy
  max_history: 10
  epochs: 20
  batch_size:
  - 32
  - 64
- name: AugmentedMemoizationPolicy
  max_history: 6
- name: FormPolicy
- name: RulePolicy
  core_fallback_threshold: 0.3
  core_fallback_action_name: action_default_fallback
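
Note on the policies above: the training log includes a UserWarning about using 'RulePolicy' together with deprecated policies that implement rule-like behavior, and this config lists both FormPolicy and RulePolicy. Below is a minimal sketch in plain Python (no Rasa imports) that flags that combination. The helper name `find_policy_conflicts` and the `DEPRECATED_RULE_LIKE` set are illustrative assumptions and not part of Rasa; the set lists the policies that, as far as I know, the Rasa 2.0 migration guide marks as deprecated in favor of RulePolicy. Whether this mix is related to the crash is not established in this thread.

```python
# Hypothetical helper; the names here are illustrative and not part of Rasa.
DEPRECATED_RULE_LIKE = {
    "FormPolicy",
    "FallbackPolicy",
    "TwoStageFallbackPolicy",
    "MappingPolicy",
}


def find_policy_conflicts(policies):
    """Return deprecated rule-like policies configured next to RulePolicy."""
    names = [policy["name"] for policy in policies]
    if "RulePolicy" not in names:
        return []
    return [name for name in names if name in DEPRECATED_RULE_LIKE]


# The `policies` section of the config above, written out as Python data.
policies = [
    {"name": "TEDPolicy", "max_history": 10, "epochs": 20, "batch_size": [32, 64]},
    {"name": "AugmentedMemoizationPolicy", "max_history": 6},
    {"name": "FormPolicy"},
    {
        "name": "RulePolicy",
        "core_fallback_threshold": 0.3,
        "core_fallback_action_name": "action_default_fallback",
    },
]

conflicts = find_policy_conflicts(policies)
print(conflicts)  # ['FormPolicy']
```

For this config the check reports FormPolicy, which matches the FutureWarning and UserWarning in the log.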

Content of domain file (domain.yml) (if relevant):

slots:
  budget:
    type: unfeaturized
    influence_conversation: false
...
responses:
  utter_already_subscribed:
...
actions:
- action_default_ask_affirmation
- action_default_fallback
- action_docs_search
- action_explain_sales_form
- action_set_faq_slot
- action_explain_faq
- action_forum_search
- action_get_community_events
- action_greet_user
- action_next_step
- action_pause
- action_set_onboarding
- action_store_bot_language
- action_store_entity_extractor
- action_store_problem_description
- action_store_unknown_nlu_part
- action_store_unknown_product
- action_tag_docs_search
- action_tag_feedback
- sales_form
- subscribe_newsletter_form
- suggestion_form
- respond_chitchat
- respond_faq
- respond_out_of_scope
version: '2.0'

sara-tagger commented 3 years ago

Thanks for the issue, @akelad will get back to you about it soon!

You may find help in the docs and the forum, too 🤗

akelad commented 3 years ago

Hi @XiaoLiuAI, what do you mean by "training data is converted from project rasa-demo"? Did you manually run the conversion commands? The master branch of https://github.com/RasaHQ/rasa-demo is already converted to 2.0.

XiaoLiuAI commented 3 years ago

@akelad I first tried to train the model directly on the data from rasa-demo but got lots of error messages, so I converted the data manually with the conversion commands. The conversion commands ignored the responses directory under data/nlu. Another possibility is that I cloned rasa-demo too early and did not get the latest version.

By the way, could you figure out the problem from the error message? It is strange that well-formatted data (which may have some logical problems) leads to such a low-level error message. That suggests it is quite difficult to validate whether data is actually runnable.
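
On the message itself ("reshape cannot infer the missing input size for an empty tensor unless all specified input sizes are non-zero"): the arithmetic behind it can be sketched in plain Python. This is an illustration only, not TensorFlow's implementation, and `infer_missing_dim` is a hypothetical name. It does not explain why the `active_loop` feature tensor was empty in this run.

```python
from functools import reduce
from operator import mul


def infer_missing_dim(total_elements, target_shape):
    """Resolve the single -1 entry in target_shape, as a reshape would.

    Illustrative only; this is not TensorFlow's implementation.
    """
    if list(target_shape).count(-1) != 1:
        raise ValueError("expected exactly one -1 in target_shape")
    # Product of all explicitly specified sizes.
    known = reduce(mul, (d for d in target_shape if d != -1), 1)
    if known == 0:
        # The missing size would be total / 0, so it cannot be determined.
        raise ValueError("cannot infer the missing size when a specified size is zero")
    if total_elements % known != 0:
        raise ValueError("total size is not divisible by the specified sizes")
    return total_elements // known


print(infer_missing_dim(12, (3, -1)))  # 4
print(infer_missing_dim(0, (3, -1)))   # 0: empty, but still inferable

try:
    infer_missing_dim(0, (0, -1))
except ValueError as err:
    print(err)
```

The last call mirrors the failing case: an empty tensor reshaped to a shape that contains both a zero and an unknown size.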

akelad commented 3 years ago

Yeah, I think you need to pull the latest on master, since that's already converted to 2.0.

Unfortunately, I am not sure where this error is coming from. Does it occur every time? And did you try running rasa data validate before training?
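
The validate-before-train order suggested here could be scripted roughly as follows. This is a minimal sketch: `validate_then_train` is a hypothetical helper, and it assumes that `rasa data validate` exits with a non-zero status when validation fails. The `run` parameter is injectable so the flow can be tried without Rasa installed.

```python
import subprocess
from types import SimpleNamespace


def validate_then_train(run=subprocess.run):
    """Run `rasa data validate`, then `rasa train` only if validation succeeds.

    Assumes a failing step reports a non-zero return code.
    Returns the list of commands that were executed.
    """
    executed = []
    for command in (["rasa", "data", "validate"], ["rasa", "train"]):
        executed.append(command)
        result = run(command)
        if result.returncode != 0:
            # Stop at the first failing step instead of training on bad data.
            break
    return executed


def failing_validation(command):
    """Stand-in runner for the demo: validation fails, anything else succeeds."""
    failed = command == ["rasa", "data", "validate"]
    return SimpleNamespace(returncode=1 if failed else 0)


print(validate_then_train(run=failing_validation))  # [['rasa', 'data', 'validate']]
```

With Rasa installed, calling `validate_then_train()` without arguments would run the real commands in the current project directory.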

XiaoLiuAI commented 3 years ago

@akelad I ran rasa data validate before converting the data. Training succeeded once last week, but it has failed three times since I updated Rasa to the latest version this week, which is quite confusing.

RasaHQ / rasa

core training failed with data converted from rasa-demo #7324
