@znat wrote:
Would be really great to have that and I'm interested in contributing. For example, I like the way recast.ai adds sentiment to any query.
This is to start the discussion: what should it be? Should it be (1) a pre-trained, per-language model with state-of-the-art results, like models trained on the Stanford Sentiment Treebank, or (2) a model trained in Rasa? (A sentiment label would be added to each example fed to Rasa at training time.)
(1) would probably complicate the stack a lot (recursive networks are not well supported in DL frameworks), but there are non-state-of-the-art models that are still good and easy to implement in Keras. But does it make sense to add DL to the Rasa stack?
(2) seems easier, as I suppose decent results on small datasets can be obtained with algorithms supported by sklearn. Probably lower accuracy, but it's also domain-specific so it may be good enough.
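For illustration, option (2) could be as simple as a sklearn pipeline trained on a small, hand-labeled, domain-specific dataset; the examples and labels below are made up:

```python
# Minimal sketch of option (2): a domain-specific sentiment classifier
# trained with scikit-learn. The training data here is illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical labeled examples collected alongside the NLU training data.
texts = ["this is great, thanks!", "that did not help at all", "ok, fine"]
labels = ["pos", "neg", "neutral"]

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),  # word + bigram features
    ("model", LogisticRegression()),
])
clf.fit(texts, labels)

print(clf.predict(["this bot is useless"]))  # e.g. ["neg"]
```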
What are your thoughts?
I think ideally, and in the future, we will go for both; thanks to the modular architecture we can support both.
I am not sure how much data a reasonably useful model needs to achieve stable results. Hence, I think starting with a pre-trained model might be better.
1) There is no issue with introducing new dependencies (it will be an optional dependency for people who want to use this part of the framework). I would avoid small DL frameworks though, and if there is something that is nearly as good but simpler I'd always go for that (in practice it is ok to get 1.0% less accuracy but have a way simpler model and shorter training times).
2) Completely agree; it would need to be evaluated, though (whether a simple domain-specific model outperforms a complex pre-trained model).
All said, I'd really like this to be part of the framework, no matter which of the two solutions it is.
Currently I'm trying to add a sentiment analysis component to Rasa NLU and build a bot with Rasa Core that can respond to positive/negative sentiment accordingly. I have two thoughts:
1) Add the learned sentiment as a slotted entity and code the actions with different responses for different sentiments. This looks simple and could work (see the sketch below).
2) Feed the sentiment to the featurizer and add it to the dialogue training / response prediction. Here my doubt is whether the sentiment feature would be drowned out by the vast majority of other NLP features and thus have little impact on the prediction. But this is the "AI" way of doing it.
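A rough sketch of approach 1), assuming the NLU component fills a slot named `sentiment`; the slot name, action name, and responses are hypothetical, and the API shown is the rasa_sdk one, which may differ between versions:

```python
# Sketch of approach 1): a custom action that branches on a "sentiment"
# slot filled by the NLU pipeline. Names and responses are made up.
from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher


class ActionRespondToSentiment(Action):
    def name(self) -> str:
        return "action_respond_to_sentiment"

    def run(self, dispatcher: CollectingDispatcher, tracker: Tracker, domain: dict):
        sentiment = tracker.get_slot("sentiment")  # e.g. "pos" / "neg"
        if sentiment == "neg":
            dispatcher.utter_message(text="Sorry to hear that. Let me try to help.")
        else:
            dispatcher.utter_message(text="Glad to hear it!")
        return []
```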
What do you think of that?
The reason I never followed up on this issue is that I came to the conclusion that it was not a good idea. At that time at least, no accessible model was accurate enough to be used in a conversation. It's not a problem to be wrong sometimes when you need aggregate results over thousands of tweets, but if you're wrong in a conversation, 100% of that conversation is a failure.
Plus, the available datasets seemed too biased to be applied to a general conversation model, and simple models are just not good enough.
In the end, training intents that convey the idea of frustration turned out to be easy and reliable, and that's my recommendation to those trying to use sentiment to shape a conversation.
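For illustration, such an intent can be trained straight from the NLU training data; the intent name and examples below are made up:

```md
## intent:frustration
- this is not working at all
- you are not helping me
- I already told you that twice
- this is really annoying
```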
@amn41 I'd say this issue isn't relevant any more, right?
Create a component that adds sentiment analysis to the messages.
E.g. with spaCy there is already an example available that does this (https://explosion.ai/blog/spacy-deep-learning-keras#deep-learning)
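For reference, a minimal sketch of what such a component could look like, using the rasa_nlu custom-component interface (class and attribute names may differ between versions); `score_sentiment` is a hypothetical stand-in for whatever model gets plugged in, e.g. the spaCy/Keras one from the linked post:

```python
# Minimal sketch of a sentiment component for the rasa_nlu pipeline.
# `score_sentiment` is a placeholder for an actual sentiment model.
from rasa_nlu.components import Component


def score_sentiment(text: str) -> float:
    """Placeholder: return a sentiment score in [-1, 1] for `text`."""
    raise NotImplementedError


class SentimentAnalyzer(Component):
    name = "sentiment_analyzer"
    provides = ["sentiment"]

    def process(self, message, **kwargs):
        # Attach the sentiment score to the parsed message so it shows
        # up in the NLU output alongside intent and entities.
        score = score_sentiment(message.text)
        message.set("sentiment", score, add_to_output=True)
```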