Sramperu closed 6 years ago
Hi
This is something we are working on improving for future releases. Right now the model has no way of knowing that 'Sydney' alone is a city name unless it is defined in the training set as such, or it's been trained on several similar examples.
Although the current model looks at features such as whether the word is capitalized, it has likely learned that locations often come after prepositions like 'in' or 'going to' and may be overfitting to this.
In future releases we will include internal or user-specified lookup tables so that the model will be much better at picking out city names like 'Sydney' and labelling them as location entities. We'll also include some tools for creating character n-gram features that may help extract custom entities. For now I'd assume it's nothing you did wrong. You could try adding a single 'Sydney' training example labelled as a location entity and see if it works.
Hope it helps
@twhughes Thanks for the reply. Sydney is in fact present in the training data, but in the training data it appears in a sentence like "get me weather in sydney", right? So even though I trained on it, when training the story or conversing with the bot through DMM we don't always mention the location in the same message, and it has to be captured separately.
Having said this, I went through this Weatherbot example: https://github.com/JustinaPetr/Weatherbot_Tutorial.git, which also has a video tutorial. In the video, the scenario is finding the weather in a specific place, and the bot accepted the location without any preposition identifier.
This made me wonder whether there is anything in the pipeline I need to alter.
Please let me know.
Just to be clear, did you add a few examples of one-word inputs with city as an entity? Like:
{
  "text": "Canada",
  "intent": "inform",
  "entities": [
    {
      "start": 0,
      "end": 6,
      "value": "Canada",
      "entity": "location"
    }
  ]
}
for example? @JustinaPetr said that she included some like this when she did the demo. Since this is a demo, you can expect less-than-ideal performance. In general usage you should supply your own training data covering the use cases you expect, and the more examples the merrier.
To answer your question, the pipeline looks fine to me.
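If you have many cities to cover, writing these one-word examples by hand gets tedious. Here is a minimal sketch that generates them in the same shape as the example above. The helper name `one_word_example` and the city list are hypothetical, made up for illustration; the `inform` intent and `location` entity are taken from the example in this thread. How you merge the output into your own training file depends on your setup, so that part is left out.

```python
import json


def one_word_example(city, intent="inform", entity="location"):
    """Build a training example whose whole text is the city name.

    The entity span covers the full text, so start is 0 and end is
    the length of the string (end is exclusive, as in the example above).
    """
    return {
        "text": city,
        "intent": intent,
        "entities": [
            {
                "start": 0,
                "end": len(city),
                "value": city,
                "entity": entity,
            }
        ],
    }


# Hypothetical city list; replace it with the places your bot should know.
cities = ["Canada", "Sydney", "Berlin"]
examples = [one_word_example(city) for city in cities]

if __name__ == "__main__":
    # Print the examples as JSON so they can be copied into a training file.
    print(json.dumps(examples, indent=2))
```

Computing `end` from `len(city)` avoids off-by-one mistakes in the character offsets, which are easy to make when typing the spans manually.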
Hey @Sramperu.
Regarding the entities, I agree with @twhughes: it's actually quite complicated to train the model so that it knows that 'Sydney' alone is a city name. The only thing that might differ between the Weatherbot tutorial and your data is that the Weatherbot tutorial dataset includes a few examples of those one-word city inputs, and that's why it sometimes (but unfortunately not always) extracts the city without a preposition. So just like @twhughes suggested, try adding some examples like this as well.
Hi @twhughes , any update on this please?
@vivekanon No update, sorry.