Closed jonasblumer closed 6 years ago
I understand how this is confusing, but it's actually expected behaviour. The synonyms only map to a particular value once they have been recognised as entities. You will still have to add some examples with e.g. covfefe marked as an entity.
If you're up for creating a PR to make the docs clearer on this that would be 💯
Thank you for the quick reply. May I ask, then, what the point is of defining synonyms via entity_synonyms? Is it only to get the processors ["ner_synonyms"] prop in the reply, or are there any other benefits? As far as I can tell, additionally defining entity_synonyms doesn't change the output when I add the synonyms to the common_examples array anyway to get a match.
I'll gladly update the docs and contribute as soon as I'm clear on the benefits. Thank you!
I am the one that added the note to the docs under the entity synonyms section here.
But I still struggle to explain how this works. In the common_examples section of the training data, if you label a section of the text as an entity, then that is fed into training an entity recognition model. Only the examples in the common_examples section are fed into the model training. So since you only provided examples with an entity value of coffee, the model has not generalized that the item entity can have more values than just coffee. When you add the covfefe example into the common_examples section, it is successfully parsed as an entity by the model.
Once coffee or covfefe are recognized as entity values, THEN entity synonyms come into play. In this case they say covfefe is a synonym of coffee, so I am going to replace the synonym covfefe with its defined value coffee.
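To make the ordering concrete, here is a minimal Python sketch of the synonym step as described above. It is not Rasa's actual implementation: `apply_synonyms` and the plain-dict synonym table are hypothetical names used only for illustration. The point it tries to show is that the mapper only rewrites entities that the extractor has already found, so an empty extraction result stays empty.

```python
from typing import Dict, List


def apply_synonyms(entities: List[dict], synonyms: Dict[str, str]) -> List[dict]:
    """Replace each already-extracted entity value with its canonical value.

    Hypothetical helper: it never creates entities, it only rewrites them.
    """
    mapped = []
    for entity in entities:
        entity = dict(entity)  # copy so the caller's data is not mutated
        canonical = synonyms.get(str(entity["value"]).lower())
        if canonical is not None and canonical != entity["value"]:
            entity["value"] = canonical
            # Record that this entity was touched by the synonym step.
            entity["processors"] = entity.get("processors", []) + ["ner_synonyms"]
        mapped.append(entity)
    return mapped


synonyms = {"covfefe": "coffee"}

# Case 1: the extractor found nothing, so there is nothing to map.
no_entities = apply_synonyms([], synonyms)

# Case 2: the extractor recognised "covfefe", so the mapper rewrites it.
extracted = [
    {"extractor": "ner_crf", "start": 12, "end": 19, "value": "covfefe", "entity": "item"}
]
mapped = apply_synonyms(extracted, synonyms)
print(mapped[0]["value"])
```

In the first case the synonym table is never consulted, which matches the behaviour reported in this issue when covfefe has no labelled examples.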
Said another way, the expected output for the request Please have covfefe:
With entity_synonyms:
{
  "entities": [
    {
      "extractor": "ner_crf",
      "end": 19,
      "processors": [
        "ner_synonyms"
      ],
      "value": "coffee",
      "entity": "item",
      "start": 12
    }
  ],
  "intent": null,
  "text": "Please have covfefe",
  "intent_ranking": []
}
Notice how the user asked for covfefe, but the entity value returned was coffee. This is because it was processed by ner_synonyms.
Without entity_synonyms:
{
  "entities": [
    {
      "extractor": "ner_crf",
      "end": 19,
      "value": "covfefe",
      "entity": "item",
      "start": 12
    }
  ],
  "intent": null,
  "text": "Please have covfefe",
  "intent_ranking": []
}
Notice that without synonyms the actual parsed entity value of covfefe is returned.
Also @jonasblumer check out https://github.com/RasaHQ/rasa_nlu/issues/773
Thank you for the detailed answers! It does seem to me that the docs could be more specific. So the following two examples will return the same result:
{
  "rasa_nlu_data": {
    "entity_synonyms": [
      {
        "value": "coffee",
        "synonyms": ["covfefe"]
      }
    ],
    "common_examples": [
      {
        "text": "would like covfefe",
        "intent": "order",
        "entities": [
          {
            "start": 11,
            "end": 18,
            "value": "covfefe",
            "entity": "item"
          }
        ]
      }
    ]
  }
}
This will return a match with a value of coffee because of the entity_synonyms mapping. Notice that in the common examples, the value is covfefe.
AND
{
  "rasa_nlu_data": {
    "common_examples": [
      {
        "text": "would like covfefe",
        "intent": "order",
        "entities": [
          {
            "start": 11,
            "end": 18,
            "value": "coffee",
            "entity": "item"
          }
        ]
      },
      {
        "text": "would like coffee",
        "intent": "order",
        "entities": [
          {
            "start": 11,
            "end": 17,
            "value": "coffee",
            "entity": "item"
          }
        ]
      }
    ]
  }
}
This will return the same thing, as the value of both entities is coffee. No need for using entity_synonyms here.
In my current understanding, these two examples are exactly equivalent.
Is that correct? If yes, I will gladly try to make this clearer in a PR to update the docs.
Yes, entity_synonyms just provides a place where more synonyms can be defined in a smaller space, provided that there are still enough examples in the common_examples section for the model to generalize and recognize them.
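A rough way to see why the two training files behave the same is to sketch how a synonym table could be collected from them. The following is only a sketch, under the assumption that a labelled span whose text differs from its `value` is treated like an entity_synonyms entry; `collect_synonyms` is a hypothetical helper, not a Rasa function, and the two dicts mirror the training data discussed in this thread.

```python
from typing import Dict


def collect_synonyms(training_data: dict) -> Dict[str, str]:
    """Build a surface-form -> canonical-value table from training data."""
    data = training_data["rasa_nlu_data"]
    table: Dict[str, str] = {}
    # Source 1: explicit entity_synonyms entries.
    for entry in data.get("entity_synonyms", []):
        for synonym in entry["synonyms"]:
            table[synonym.lower()] = entry["value"]
    # Source 2 (assumed): labelled spans whose text differs from the value.
    for example in data.get("common_examples", []):
        for entity in example.get("entities", []):
            # "end" is exclusive, so text[11:18] == "covfefe".
            surface = example["text"][entity["start"]:entity["end"]]
            if surface != entity["value"]:
                table[surface.lower()] = entity["value"]
    return table


with_synonyms = {
    "rasa_nlu_data": {
        "entity_synonyms": [{"value": "coffee", "synonyms": ["covfefe"]}],
        "common_examples": [
            {"text": "would like covfefe", "intent": "order",
             "entities": [{"start": 11, "end": 18, "value": "covfefe", "entity": "item"}]},
        ],
    }
}

inline_values = {
    "rasa_nlu_data": {
        "common_examples": [
            {"text": "would like covfefe", "intent": "order",
             "entities": [{"start": 11, "end": 18, "value": "coffee", "entity": "item"}]},
            {"text": "would like coffee", "intent": "order",
             "entities": [{"start": 11, "end": 17, "value": "coffee", "entity": "item"}]},
        ],
    }
}

table_a = collect_synonyms(with_synonyms)
table_b = collect_synonyms(inline_values)
print(table_a == table_b)
```

Under that assumption both layouts end up with the same single mapping from covfefe to coffee, so the entity_synonyms section is mainly a more compact way of listing many surface forms.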
@jonasblumer I am going to close this one, but please do submit a PR. Also, let me know if your issue isn't resolved.
The ultimate power of entity synonyms comes together with the phrase matcher! I just played with the phrase matcher and placed it before NER in the pipeline, so that untrained entities like item are recognized first, and afterwards covfefe is replaced with coffee via entity_synonyms! And you don't need to train covfefe!
Rasa version: Rasa 1.6.0
Rasa SDK version (if used & relevant):
Rasa X version (if used & relevant):
Python version: python3.6.9
Operating system (windows, osx, ...): ubuntu 18.04 LTS
Issue: Failed to load the NLU model while starting rasa shell to test my bot.
The NLU data and stories are correct and were tested with embedded supervised
Error (including full traceback):
2020-02-06 21:39:29 INFO root - Connecting to channel 'cmdline' which was specified by the '--connector' argument. Any other channels will be ignored. To connect to all given channels, omit the '--connector' argument.
2020-02-06 21:39:29 INFO root - Starting Rasa server on http://localhost:5005
2020-02-06 21:39:32 INFO absl - Entry Point [tensor2tensor.envs.tic_tac_toe_env:TicTacToeEnv] registered with id [T2TEnv-TicTacToeEnv-v0]
/home/ai/ai/rasa/o/lib/python3.6/site-packages/rasa/nlu/classifiers/embedding_intent_classifier.py:962: UserWarning: Failed to load nlu model. Maybe path '/tmp/tmpwistue_9/nlu' doesn't exist.
f"Failed to load nlu model. "
2020-02-06 21:39:33 INFO rasa.nlu.selectors.embedding_response_selector - Retrieval intent parameter was left to its default value. This response selector will be trainedon training examples combining all retrieval intents.
Bot loaded. Type a message and press enter (use '/stop' to exit):
Your input -> tell me location
2020-02-06 21:39:57 ERROR rasa.nlu.classifiers.embedding_intent_classifier - **There is no trained tf.session: component is either not trained or didn't receive enough training data.**
Your input -> /stop
2020-02-06 21:41:47 INFO root - Killing Sanic server now.
Command or request that led to error:
$ rasa shell
Content of configuration file (config.yml) (if relevant):
# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en
pipeline:
  - name: "WhitespaceTokenizer"
  - name: "RegexFeaturizer"
  - name: "CRFEntityExtractor"
  - name: "EntitySynonymMapper"
  - name: "CountVectorsFeaturizer"
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: "EmbeddingIntentClassifier"
  - name: "ResponseSelector"
# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
  - name: MemoizationPolicy
  - name: KerasPolicy
  - name: MappingPolicy
Content of domain file (domain.yml) (if relevant):
intents:
  - greet
  - goodbye
  - query_knowledge_base
  - bot_challenge
  - location_ask
  - time_t
  - who_ask
entities:
  - location
  - address
  - berlin
  - date
  - time
  - services
actions:
  - utter_iamabot
  - utter_greet
  - utter_goodbye
  - utter_ask_rephrase
  - action_location
  - action_time
templates:
  utter_greet:
    - text: "Hey!"
    - text: "Hello! How can I help you?"
  utter_goodbye:
    - text: "Bye"
    - text: "Goodbye. See you soon."
  utter_ask_rephrase:
    - text: "Sorry, I'm not sure I understand. Can you rephrase?"
    - text: "Can you please rephrase? I did not got that."
  utter_iamabot:
    - text: "I am a bot, powered by Rasa."
Working with the latest version of rasa_nlu, I'm having a problem where synonyms defined by "entity_synonyms" don't return a match. My training data looks as follows:
When I send please have coffee, then an item of the value coffee is identified. But when I enter please have covfefe, I don't get a match, even though covfefe is set to be a synonym.
BUT if I add training data for "covfefe" like so:
I DO get a match - with processor ["ner_synonyms"]. So synonyms do seem to be working, but setting them via an entity_synonyms object doesn't work.