enzoampil / tito-joker

A humorous AI that uses state-of-the-art deep learning to tell jokes
http://35.225.94.177:8501/
GNU General Public License v3.0

Train model2 - with registered additional tokens, lower sequence length, etc #15

Closed by enzoampil 4 years ago

enzoampil commented 4 years ago

Updates:

  1. Reduce sequence length to 50 (closer to typical joke length)
  2. Double batch size from 2 to 4 (can afford it now due to 1)
  3. Special tokens were formally added to the vocabulary
  4. Still fine-tuned on OpenAI-GPT2 w/ the same dataset (but w/ the custom end token removed in favor of GPT2's default end of sequence special token, <|endoftext|>)
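
To make the updates above concrete, here is a minimal sketch of how the data preparation could look under these settings. It is not the training code from this repo: the whitespace tokenizer, `build_vocab`, `encode_joke`, and `make_batches` are hypothetical stand-ins for illustration. Only the sequence length of 50, the batch size of 4, and the `<|endoftext|>` token come from the list above.

```python
from typing import Dict, List

EOS_TOKEN = "<|endoftext|>"  # GPT2's default end of sequence token (update 4)
MAX_SEQ_LEN = 50             # reduced sequence length (update 1)
BATCH_SIZE = 4               # doubled batch size (update 2)


def build_vocab(special_tokens: List[str]) -> Dict[str, int]:
    """Register special tokens up front so each maps to a single id (update 3)."""
    return {tok: i for i, tok in enumerate(special_tokens)}


def encode_joke(joke: str, vocab: Dict[str, int]) -> List[int]:
    """Toy whitespace tokenizer (an assumption, not GPT2's BPE tokenizer).

    Unseen words get the next free id. The joke is truncated so that, with the
    end of sequence id appended, it never exceeds MAX_SEQ_LEN tokens.
    """
    ids = [vocab.setdefault(word, len(vocab)) for word in joke.split()]
    return ids[: MAX_SEQ_LEN - 1] + [vocab[EOS_TOKEN]]


def make_batches(jokes: List[str], vocab: Dict[str, int]) -> List[List[List[int]]]:
    """Encode every joke and group the results into batches of BATCH_SIZE."""
    encoded = [encode_joke(joke, vocab) for joke in jokes]
    return [encoded[i : i + BATCH_SIZE] for i in range(0, len(encoded), BATCH_SIZE)]
```

Calling `make_batches(jokes, build_vocab([EOS_TOKEN]))` on a list of joke strings yields lists of at most 4 token sequences, each at most 50 ids long and ending in the end of sequence id. In a real setup the tokenizer and model would come from the fine-tuning library in use, and the model's embedding table would need to cover any newly registered tokens.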