Open samarth12 opened 6 years ago
Hey @samarth12,
Can you elaborate a bit? The dataset is just a JSON file, so you can add new intents by adding new key/value pairs inside the "intents" dictionary.
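A minimal sketch of what that could look like, assuming the usual JSON layout where each entry of "intents" holds an "utterances" list. The helper name `add_intent` is hypothetical and not part of Snips NLU:

```python
import json


def add_intent(dataset, intent_name):
    """Add an empty intent entry to the dataset if it is not there yet.

    Assumes the dataset is a dict loaded from the JSON file, with an
    "intents" dictionary keyed by intent name.
    """
    intents = dataset.setdefault("intents", {})
    # setdefault keeps any utterances already stored for this intent.
    intents.setdefault(intent_name, {"utterances": []})
    return dataset


# Hypothetical empty dataset, used only for illustration.
dataset = {"intents": {}, "entities": {}, "language": "en"}
add_intent(dataset, "BookTicket")
print(json.dumps(dataset, indent=2))
```

The new intent still needs labelled utterances before retraining on it is useful.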
Currently, based on the training dataset, the classifier either returns the classified intent or just returns a None intent. What I am looking to do applies when the output is "None" for a new user input sentence:
For both steps 1 and 2, I need to update the dataset with this new input sentence and its corresponding intent, in exactly the same format as the input dataset, so that I can retrain the model on the updated dataset over time.
I would love to know your thoughts on that, @adrienball. Please let me know if you still have questions about it.
Hey @samarth12, that is a reasonable use case indeed. The Snips NLU library will not handle this logic for you though, so, as you said, you will have to write a script to convert the user input into the format used in Snips NLU's dataset.
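As a rough sketch of the first half of such a script (deciding which inputs need to be sent back to the user for labelling): the code below assumes the parse result is a dict with "input" and "intent" keys, and that an unrecognized input shows up either as `"intent": None` or as an intent whose "intentName" is None, depending on the library version. The names `needs_labelling`, `handle` and `pending` are hypothetical.

```python
def needs_labelling(parse_result):
    """Return True when the parse result carries no recognized intent."""
    intent = parse_result.get("intent")
    # Assumption: unrecognized inputs come back either with no intent
    # object at all, or with an intent object whose name is None.
    if intent is None:
        return True
    return intent.get("intentName") is None


# Inputs waiting for a label (intent and slots) from the user.
pending = []


def handle(parse_result):
    """Queue unrecognized inputs, otherwise return the intent name."""
    if needs_labelling(parse_result):
        pending.append(parse_result["input"])
        return None
    return parse_result["intent"]["intentName"]


# Hand-written parse results, used only for illustration.
recognized = {
    "input": "Book me a ticket from LA to NYC.",
    "intent": {"intentName": "BookTicket", "probability": 0.9},
    "slots": [],
}
unknown = {
    "input": "Order me a pizza",
    "intent": {"intentName": None, "probability": 0.4},
    "slots": [],
}
print(handle(recognized))
print(handle(unknown))
print(pending)
```

Once the queued inputs are labelled and appended to the dataset, the engine can be retrained on the updated dataset.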
Cheers, Adrien
So my next question is: if I try adding new intents to the dataset using key-value pairs, how exactly do I break a single input and its corresponding intent into the exact format? That is, how do I get the accurate slot name and its corresponding entity, like it is done in the dataset? I have tried extracting entities from my new inputs using external APIs and then pushing them into the dataset, but that only results in very bad accuracy!
Again, I am not sure if I was able to explain it as well as I wanted to, but thanks for your feedback @adrienball.
From my understanding of what you are trying to achieve, it's the user who should provide the labelling (both intent and slots) of new inputs that are not recognized. Only the user knows what their input corresponds to, so only they can specify the intent and the slots. Or maybe I'm missing something?
Okay so the user just provides the right intent (which they think is right) for the new input sentence, the entity and slot should be automatically identified and added to the dataset in the correct format for the corresponding input. How is SNIPS doing that while generating the dataset initially?
E.g.: Book me a ticket from LA to NYC.
Intent: BookTicket (user defines this)
Entity: location
Slot name: departure (LA) and destination (NYC)
Now is there a model that just extracts and defines the entity and slot name (which seems kind of hard), or does that need to be done manually by the user?
Thank you so much for being responsive, @adrienball!
@samarth12
> Okay so the user just provides the right intent (which they think is right) for the new input sentence, the entity and slot should be automatically identified and added to the dataset in the correct format for the corresponding input. How is SNIPS doing that while generating the dataset initially?
The entity and slot should not be automatically identified in that case. If the user input is not recognized by the NLU, it's likely that it contains atypical data which needs to be completely labeled, i.e. the user must provide both the intent and the slots. This is quite clear in the case where the input corresponds to a new intent: how could you possibly know the slots you are looking for in this input?
In your example, "Book me a ticket from LA to NYC.", that means the user should provide everything below, and not only the first line about the intent:
Intent: BookTicket
Slot 1:
- slot name: departure
- entity: location
- value: LA
Slot 2:
- slot name: destination
- entity: location
- value: NYC
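Once the user has provided all of that, converting it into a dataset entry is mostly string splitting. Here is a minimal sketch, assuming the JSON dataset layout where each utterance is a "data" list of text chunks and slot chunks carry "entity" and "slot_name" keys. The helper names are hypothetical, the default entity options are an assumption that may differ across versions, and slots are expected in the order they appear in the sentence:

```python
def build_utterance(text, slots):
    """Split `text` into chunks, tagging the chunks that match a slot value."""
    chunks = []
    cursor = 0
    for slot in slots:
        start = text.find(slot["value"], cursor)
        if start == -1:
            raise ValueError("slot value not found: %r" % slot["value"])
        if start > cursor:
            # Plain text between the previous slot and this one.
            chunks.append({"text": text[cursor:start]})
        chunks.append({
            "text": slot["value"],
            "entity": slot["entity"],
            "slot_name": slot["slot_name"],
        })
        cursor = start + len(slot["value"])
    if cursor < len(text):
        # Trailing text after the last slot.
        chunks.append({"text": text[cursor:]})
    return {"data": chunks}


def add_labelled_input(dataset, intent_name, text, slots):
    """Append a fully labelled input to the dataset, creating keys as needed."""
    intents = dataset.setdefault("intents", {})
    intent = intents.setdefault(intent_name, {"utterances": []})
    intent["utterances"].append(build_utterance(text, slots))
    entities = dataset.setdefault("entities", {})
    for slot in slots:
        # Assumed default options for a custom entity; adjust to your version.
        entities.setdefault(slot["entity"], {
            "data": [],
            "use_synonyms": True,
            "automatically_extensible": True,
            "matching_strictness": 1.0,
        })
    return dataset


dataset = {"intents": {}, "entities": {}, "language": "en"}
add_labelled_input(
    dataset,
    "BookTicket",
    "Book me a ticket from LA to NYC.",
    [
        {"slot_name": "departure", "entity": "location", "value": "LA"},
        {"slot_name": "destination", "entity": "location", "value": "NYC"},
    ],
)
```

Matching slot values by substring is the simplest option but can pick the wrong occurrence if a value appears earlier in the sentence as part of another word. Asking the user for character offsets instead would avoid that.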
Yes, thank you! That is something I have been wondering about for a while. But do you think there is any API out there, based on a pre-trained engine, that is capable of detecting entities as well as their slots and that might be helpful here? I know it doesn't make too much sense, but the idea is that I pick the entity and the slots from that API, take the intent from the user, and append all of that to the Snips NLU dataset.
All of it sounds pretty vague, but I think you get the idea of what I am trying to achieve with this. What would you suggest? Asking the users for the slots and entities might not work on a large scale (too cumbersome), and people might just choose not to go through with the process. Do you have any other ideas or thoughts on this, @adrienball?
Thanks again for helping me out!
Is there a way I can iteratively update the snips dataset if I want the user to add new intents to the system if something is classified as None?