enzoampil / tito-joker

A humorous AI that uses state-of-the-art deep learning to tell jokes
http://35.225.94.177:8501/
GNU General Public License v3.0
45 stars 5 forks

Apply sentiment control (configure sentiment before generation) #8

Closed enzoampil closed 4 years ago

enzoampil commented 4 years ago

Two approaches in mind so far:

  1. During training, append the sentiment score of the joke to the input vector (or the last one), so it can be added as a feature to contextualize the output.

The advantage of this approach is flexibility around sentiment: because the score is continuous, the model can distinguish, for example, very happy from slightly happy.

E.g.

embedding_vector = [0.1, 0.2, 0.3]
sentiment_score = [0.8]
# List concatenation appends the score as one extra feature
model_input = embedding_vector + sentiment_score  # [0.1, 0.2, 0.3, 0.8]

  2. Add sentiment tags to the dataset to correspond to the mood of the joke. These can be inferred from the sentiment / toxicity predictions of pre-trained models.

E.g.

raw_input = "Why did the chicken cross the road?"
processed_input = "<sad> Why did the chicken cross the road?"

To create a dataset with sentiment tags, we can simply reuse existing sentiment analysis models and apply them to each joke in the dataset. We can start off with fine-tuned BERT models for sentiment analysis on full text (example), and then move towards span-level controls (example).
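
A rough sketch of the full-text tagging step, to make the idea concrete. Everything here is an assumption for illustration: `toy_sentiment_score` is a hypothetical keyword-based stand-in for a real pre-trained sentiment model, and the `<happy>` / `<neutral>` tags and the 0.25 threshold are made-up choices (only `<sad>` appears in the example above).

```python
from typing import Callable, List

# Toy word lists used only by the stand-in scorer below (assumption).
NEGATIVE_WORDS = {"died", "sad", "cry", "lost"}
POSITIVE_WORDS = {"happy", "love", "fun", "great"}


def toy_sentiment_score(text: str) -> float:
    """Hypothetical stand-in for a pre-trained sentiment model.

    Returns a score in [-1, 1]; a real setup would call a fine-tuned
    classifier here instead of counting keywords.
    """
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def score_to_tag(score: float, threshold: float = 0.25) -> str:
    """Map a continuous score to a discrete tag (tag names are assumptions)."""
    if score >= threshold:
        return "<happy>"
    if score <= -threshold:
        return "<sad>"
    return "<neutral>"


def tag_dataset(
    jokes: List[str],
    scorer: Callable[[str], float] = toy_sentiment_score,
) -> List[str]:
    """Prepend a sentiment tag to every joke in the dataset."""
    return [f"{score_to_tag(scorer(joke))} {joke}" for joke in jokes]


if __name__ == "__main__":
    for line in tag_dataset([
        "Why did the chicken cross the road?",
        "My dog died today and I am very sad",
    ]):
        print(line)
```

Passing the scorer in as an argument means the toy function can later be swapped for an actual model without touching the tagging logic.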

A span-level implementation would look like this:

E.g.

raw_input = "My dog died today and I am very sad"
output = "My dog died today and <sad> I am very sad </sad>"
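
A minimal sketch of how the span-level output above could be produced, assuming a span-extraction model has already returned the sentiment-bearing substring. The function name `wrap_span` and its arguments are hypothetical. The model call itself is not shown.

```python
def wrap_span(text: str, span: str, label: str) -> str:
    """Wrap the first occurrence of `span` in <label> ... </label> tags.

    `span` is assumed to come from a span-extraction model; if it is
    empty or not found in `text`, the text is returned unchanged.
    """
    start = text.find(span)
    if not span or start == -1:
        return text
    end = start + len(span)
    return f"{text[:start]}<{label}> {span} </{label}>{text[end:]}"


if __name__ == "__main__":
    raw_input_text = "My dog died today and I am very sad"
    print(wrap_span(raw_input_text, "I am very sad", "sad"))
```

Only the first match is wrapped here; handling repeated or overlapping spans would need character offsets from the model rather than a substring search.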

enzoampil commented 4 years ago

Can use a model from this competition for sentiment spans

https://www.kaggle.com/c/tweet-sentiment-extraction

enzoampil commented 4 years ago

Can use this for toxicity controls

https://www.kaggle.com/c/jigsaw-multilingual-toxic-comment-classification/notebooks

enzoampil commented 4 years ago

Using this model as a reference

https://www.kaggle.com/abhishek/bert-base-uncased-using-pytorch

enzoampil commented 4 years ago

I've decided to start with approach 2 above and will follow the following steps:

enzoampil commented 4 years ago

I also realized that the same approach as above can be applied to named entities.

enzoampil commented 4 years ago

The long-term goal would be to have a bot with a set of configurations that allows it to respond to humans in a controllable and relatable way.