Closed kuriakinzeng closed 6 years ago
Can you provide your pipeline? Also, do you know which spacy model you are using (if English small, medium, or large).
As for point 1, the goal isn't to add in every wording in the English language, but rather just enough to pull confidence away from the other 2 intents.
Do you only have the 2 intents? 20k training examples sounds like a lot, maybe even too many. Do you need those for entity training or there really are 20k unique ways to ask the question?
Thanks for the prompt response. I am using medium spacy model. Pipeline-wise, I didn’t specify one—is there a default?
I have 5 other intents with small dataset (e.g. greet, bye, fallback).
I used a dataset generator; the 20k includes questions of the same few formats and vary only by the entity, e.g. “tell me a joke about school” and “tell me a joke about life” where “school” and “life” are the entities.
Hey,
I am experiencing the same issues with the German small model. Maybe the reason is that you might have too many examples, so your NER is overfitting on new words? My NER now detects unseen new words very well. I trained my NLU with just 20 different entities in various sentence structures. But I face the problem that it sometimes fails with longer words which are made up of different words, like apfelbaumgarten in German (garden of apple trees) :-)
Thanks @ctrado18
Which of the following are you doing?
Can I look at your training data by any chance? No obligations though :)
I use the first one.
I tested again and found something strange. What might be the reason, or is it just because of low training data?
Some words get misclassified whereas others are classified very well! I use a sentence that was trained on and plug in an untrained entity, which is not recognized for some words! Does this depend on the spaCy model, or do I have to train more and use more entities?
@wrathagom Here's what I used as the pipeline: "pipeline": "spacy_sklearn"
I saw quite a few pipelines available and one can even design a custom one. Any advice on how I should choose a pipeline?
@kuriakinzeng I have a strong suspicion that @ctrado18 is right on the overfitting and unbalanced intents causing problems.
Generally speaking you don't need very many examples to train intents (especially simple ones with few variants), but you can need thousands of examples to train entities. It may be useful for you to remove the intent label from the vast majority of your training data. If the intent label isn't present, the example will still be used to train the entity recognizer, but won't influence the intent classifier. In this way you could balance the intents...
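In the Rasa NLU JSON training format that idea amounts to leaving out the `intent` key on an example. A sketch, assuming the old `rasa_nlu_data` format; the entity name `topic` is illustrative, not from the original data:

```json
{
  "rasa_nlu_data": {
    "common_examples": [
      {
        "text": "tell me a joke about school",
        "intent": "joke",
        "entities": [
          { "start": 21, "end": 27, "value": "school", "entity": "topic" }
        ]
      },
      {
        "text": "tell me a joke about life",
        "entities": [
          { "start": 21, "end": 25, "value": "life", "entity": "topic" }
        ]
      }
    ]
  }
}
```

The second example has no `intent` field, so it would contribute to entity training only.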
As far as pipelines, I would stick to the default until it doesn't do something you want. (like duckling)
@wrathagom That's a great suggestion. Thank you. Let me try that and report back my results :)
@wrathagom So, for entities you need many more examples. Is it then just normal that the entity isn't extracted even though you use the same sentence structure it was trained with?
Yes that makes sense because of how the generalization and training work.
@kuriakinzeng I am closing for now, but please re-open if my suggestion doesn't help!
@wrathagom I tried a smaller and hopefully a balanced dataset available here
Now, instead of being classified under "joke" (chuckNorris in this case) or "advice" intent, it is classified as "greet." Do you have any idea why this is happening?
Perhaps unrelatedly, why is SVM a popular choice for intent classification? Rasa also uses SVM right?
Thanks so much!
@wrathagom I can't re-open this issue. Can you help?
@kuriakinzeng that link doesn't work - can you share your training data a different way? As for the SVM - we use it because it's a simple classifier that we get the best results with. We've tried other classifiers like NNs, with which there's not much improvement.
Thanks for looking into it @akelad I have edited my comment above to provide the correct link :)
@ctrado18 idk if you've figured this out yourself yet - but we don't recommend you use the small german spacy model. I know spacy doesn't provide a larger one with spacy 2.0, so I'd suggest downgrading spacy to 1.8.x @kuriakinzeng thanks, i'll get round to looking at it at some point today
@kuriakinzeng i can't see anything obviously wrong with your data, apart from that it's a bit unbalanced (chuckNorris intent has only 13 examples, greet double that). I'd say try running the evaluate script on your data and see where the confusion is happening. and then balance out the data a bit if that shows a lot of intent confusion. Also - are you using the medium spacy model?
Thanks @akelad
Yes to running it with the same dataset. I find it fascinating that you've combined all of my tutorials into a single training set! And terrifying at the same time! 😆
I'm testing it now.
@kuriakinzeng I think it worked, though look at the confidence of a request like `I want a joke` vs. `apple`. In my case it was .68 vs .44; if you implemented a fallback threshold of .5 you would have something to key off of when the user is asking for an unknown intent.
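The thresholding idea is simple to sketch. This is a hedged illustration, not Rasa's API: `resolve_intent` and the dict shape are assumptions modeled on a typical NLU parse result.

```python
# Hedged sketch (names assumed, not Rasa's API): key off the intent
# confidence and fall back below a threshold of 0.5.
FALLBACK_THRESHOLD = 0.5

def resolve_intent(parse_result, threshold=FALLBACK_THRESHOLD):
    """Return the intent name, or 'fallback' when confidence is too low."""
    intent = parse_result["intent"]
    if intent["confidence"] < threshold:
        return "fallback"
    return intent["name"]

# "I want a joke" scored .68 and "apple" scored .44 in the example above.
print(resolve_intent({"intent": {"name": "joke", "confidence": 0.68}}))  # joke
print(resolve_intent({"intent": {"name": "joke", "confidence": 0.44}}))  # fallback
```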
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| advice | 1.00 | 0.94 | 0.97 | 16 |
| chuckNorris | 1.00 | 1.00 | 1.00 | 13 |
| fallback | 0.94 | 1.00 | 1.00 | 15 |
| goodbye | 1.00 | 1.00 | 1.00 | 38 |
| greet | 1.00 | 1.00 | 1.00 | 22 |
| marriageProposal | 1.00 | 1.00 | 1.00 | 8 |
| avg / total | 0.99 | 0.99 | 0.99 | 117 |
The summary is that this is an iterative approach, and things like feedback loops are essential to fine-tune the end experience.
I would double the number of examples of the advice, chuckNorris, and fallback intents and try again. Let me know what other problems you are seeing.
@akelad thanks. I had not thought about downgrading spacy. I thought there were only small German sets. I'm struggling to find out why some words (which may not be inside the small data set) are not recognized as entities, although the same sentence structure is used as in training. Can you share your ideas about that? Do I have to add more entity examples? Although I use just 20 different entity examples, the NLU detects unseen entities well, but misses some... Those untrained ones are somehow custom words. But I thought the NLU is able to detect unseen words as entities even though they are not in the spaCy language model?! Otherwise I would have to use a phrase matcher for my custom words, and a NER component would be senseless?
Like the sentence `I need a cake`, then tested on `I need a cheesecake`, where `cheesecake` (not part of training) is not recognized.
Thank you! And btw, you guys do a great job!!
The models that @akelad mentioned should only impact intent classification. Entities (when using the CRF) don't depend on the model and as you said are more based on sentence structure and text features.
What's an example of a sentence that worked and one that didn't?
@wrathagom Yes, my intent classification works fine. It is only about my entities. It is like the type of sentence I wrote above, but in German. I have not yet run the evaluation script. Maybe I can find out more about the CRF and what is going on?
The docs for the underlying library are here: https://sklearn-crfsuite.readthedocs.io/en/latest/ You can glean some information by looking at the features the Rasa CRF uses: https://sklearn-crfsuite.readthedocs.io/en/latest/
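To make the point concrete, here is an illustrative sketch (not Rasa's exact feature set) of the kind of per-token features a CRF conditions on. Because the CRF sees surface features and context rather than a dictionary of known words, an unseen word in a familiar sentence structure can still be extracted.

```python
# Illustrative per-token features for a CRF entity extractor.
# The feature names and set are assumptions for demonstration only.
def token_features(tokens, i):
    word = tokens[i]
    return {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "prefix-2": word[:2],
        "suffix-2": word[-2:],
        "prev_word": tokens[i - 1].lower() if i > 0 else "BOS",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "EOS",
    }

tokens = "I need a cheesecake".split()
# Features for the unseen word "cheesecake": the CRF can still lean on
# "prev_word = a" and the suffix, even if the word was never in training.
print(token_features(tokens, 3))
```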
@wrathagom On Gitter you said that words like `nop` may not have a vector representation in the spaCy model and therefore might not be classified. Maybe you were referring there to another NER than the CRF, right? The CRF should also work with unknown words because it does not use a vector representation? Although I use German, it should not matter for entity recognition which model I use? So I have to give more examples? I will definitely try the evaluation script first now. Afterwards I will try the new NLU with TensorFlow. Then you have no entities, and I am really excited to see how much better this will be! 😀
Have you also experienced this issue with English and CRF? You should have, since the spaCy model plays no important role. Can you give some tips? Do I just have to give more examples?
Hi. Not exactly adding to the topic, but I had a question. @wrathagom
> It may be useful for you in your training data to remove the intent label from the vast majority of your training data.
In the JSON format I see that you can just leave the intent field of the dictionary empty. But how would one do that in the markdown format?
```md
## intent:
- hi
- hello
- hey
```
Leaving intent empty as shown above leads to it being put into the regex_features in the training_data.json file that is created in the model.
I was trying to look at the code and figure out how the `.md` files are converted to `.json`, but couldn't figure it out.
@wrathagom @akelad Thanks for the effort! Although I can put a threshold of 0.5 to move "apple" from the `joke` intent to an unknown intent, I wonder why that is happening? Further, when I try "orange" it returns 0.88 confidence that it is a `greet` intent. It seems strange because I didn't train these terms. Is Rasa/spaCy using a dictionary or word2vec that could explain the relevance of "orange" to other words in my dataset?
@rithwikjc you really don't want to leave your intent name empty. If you're trying to train an intent that has examples that aren't relevant to your bot, then call it something like out_of_scope
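In the Markdown training format that would look like the following sketch (the intent name `out_of_scope` follows the suggestion above; the example phrases are placeholders):

```md
## intent:out_of_scope
- apple
- banana
- tell me about the weather on mars
```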
@kuriakinzeng yeah so the sentence representation that is used is an average of word vectors of words in the sentence, and then the sentence is classified using an SVM.
@akelad If I use a German sentence with words that have no vector representation, like custom words, does this influence my intent classification? How was this handled in versions before 12.x? Do they just get the value null?
Also, with NLU version 12.2, in a sentence like `How much costs a new bike`, `new bike` gets recognized as a whole entity. But this might be because I have not trained with adjectives in front of the word? I just trained with sentences like `How much costs a glass?`
Also, if I train just singular words in my examples and test with sentences containing the plural of that word, where the difference is just one letter at the end, it fails to recognize it. The singular word is recognized though! Why is that? My NLU now detects unseen words very well, but only in the singular form I trained with, not the plural form...
The NLU is very sensitive to things like that when you have not trained it with such cases. One way to handle it is to write an action which splits the entity into tokens and just checks whether one of the tokens matches the right entity.
@akelad What I meant was that in the JSON format training data the intent is an optional field. So @wrathagom was saying it was possible to include examples without an intent for training the entity extractor. I was wondering what the equivalent is in the Markdown format, i.e., for writing examples for entity extraction without providing an intent.
@ctrado18 yes it does influence the intent classification - this also hasn't changed in version 12 if you're using spacy_sklearn. If there are enough other relevant words in the sentence, however, the sentence should still get classified correctly.
Yes it's most likely to do with your training data.
Is this a problem with entity recognition or intent classification? Because if it's for entities: if there's just a single word, there's no context, and so the model will sometimes fail to extract words it hasn't seen before. If you get single words a lot, you can always use the whole message text as a fallback when an entity isn't recognised.
@rithwikjc we've deprecated providing entity examples separately
@akelad By singular I meant the opposite of the plural form of the word, like bike vs. bikes. That is why I opened another issue about single words as user input, like the user typing just `bike`. And though there is no context, I would still like to have it as an entity. So would it be a good idea to train the CRF with sentences and single words together?
> if you have single words a lot you can always use the whole message text as a fallback when an entity isn't recognised.
I don't understand this statement.
@akelad How can I find out if specific words are inside the spaCy model? Like the verb `kaputt` in `mein Fahrrad ist kaputt` ("my bike is broken"), so I can use an intent `problem` instead of the entity `problem`.
You can use the `has_vector` method: https://spacy.io/api/token#has_vector
@amn41 Does it matter for the CRF if the word has a representation? Do you have to take care of all the complex grammar structures when you use a language like German? Even just the plural form gets tricky because the verb form depends on it: e.g. my NLU fails to detect `was kosten gehilfen` ("what do assistants cost"), but the singular form `was kostet gehilfe` works. So the form of the verb `kosten` matters. Do you have to take care of all of this?
I think you don't need to train all verb forms, as they have the same POS tag? Both verbs in the German sentences `was kostet gehilfe` and `was kosten gehilfen` have the same POS tag.
But I found something strange. Most of my nouns are recognised as VERB or ADJ?
```python
import spacy

nlp = spacy.load('de_core_news_sm')
doc = nlp(u'was kostet eine gehilfe')
for token in doc:
    print(token.text, token.lemma_, token.pos_, token.tag_, token.dep_)
```
I get:

```
was wer PRON PWS sb
kostet kosten VERB VVFIN ROOT
eine einen DET ART nk
gehilfe gehilfe ADJ ADJA mo
```
What can I do about that? How can you build a German chatbot with these inaccuracies?
I discovered something very strange. I am developing my own lemmatizer and therefore I needed my own POS tagger using TIGER for German. In the end, I found that my POS tagger, and also the POS tagger from spaCy, are case-sensitive! That's why the above noun `gehilfe` is not tagged as a noun, but `Gehilfe` is... Hasn't anyone thought of that? That should be fixed in spaCy, right? This is very bad! I don't see where I can fix that?
I mean, capitalization kind of does matter in German when it comes to nouns, given they're all capitalized. When they're not, it's hard to tell whether it's a noun/verb sometimes
But these POS tags shouldn't really cause too much of an issue, given this isn't the only thing `ner_crf` pays attention to. I would just make sure you have enough examples in your training data. I've built a chatbot in German before and haven't had huge problems with this.
How many examples per entity/intent do you have in your data? And have you run the evaluate script on your data? Also, as for "if you have single words a lot you can always use the whole message text as a fallback when an entity isn't recognised." - if users respond to questions with single words a lot, you can write a custom action that checks whether an entity was recognised, and if not just grab the whole text they used.
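That whole-message fallback can be sketched as follows. This is a hedged illustration of the idea only; the function name and dict shape are assumptions, not Rasa's custom-action API.

```python
# Hedged sketch: if no entity was extracted and the user's message is a
# single token (e.g. they typed just "bike"), use the whole message text
# as the entity value. Names here are illustrative, not Rasa's API.
def entity_or_message_text(parse_result):
    entities = parse_result.get("entities", [])
    if entities:
        return entities[0]["value"]
    text = parse_result.get("text", "").strip()
    if len(text.split()) == 1:  # single-word message
        return text
    return None

print(entity_or_message_text({"text": "bike", "entities": []}))  # bike
```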
@akelad Thank you! But mostly users write messages all in lower case! That's why nouns get mistagged. I tested it and added more examples where they get mistagged so that it works for those cases. But without that you would need less training data! As I am working on a chatbot, it makes sense to fix that? But I don't know how to turn off this behaviour. Could you do that in Rasa? I don't have much experience with the spaCy architecture.
Right now I have 1 entity and 1 intent, with a few hundred sentences and about 30 entity value examples.
Now I understand what you meant. Isn't that also a good general solution for handling single words as input? You check if no entity is recognised, then you check the length of the message. If it is just one token long, you grab the whole message! Sounds very good?
What I also thought of is a fallback strategy if no entity is found at all (single words or whole sentences). Is it possible to call a pipeline component like a phrase matcher conditionally in a custom action? Or, more generally, to run a pipeline component only on a specific condition rather than in all circumstances?
@kuriakinzeng did you get your question answered? This thread has gotten a bit out of control 💥
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed due to inactivity. Please create a new issue if you need more help.
I'm having difficulty understanding why my model behaves in a certain way:
My models are trained with a big but quite homogeneous dataset, with examples such as "give me some advice" (intent: advice) or "tell me a joke" (intent: joke). The trained model works very well for similar queries. However, when it sees new phrases and/or words such as "apple" or "banana" that are obviously neither advice nor joke, they still get classified as either the advice or joke intent with very high confidence (>85%).
Q: Any idea why this is happening? And do you have any advice how I can do this better?
What I have tried: