turtlesoupy / this-word-does-not-exist

This Word Does Not Exist
https://www.thisworddoesnotexist.com
MIT License
gpt-2 machine-learning natural-language-generation natural-language-processing natural-language-understanding transformers


This Word Does Not Exist

This project lets people train a variant of GPT-2 that makes up words, definitions, and examples from scratch.

For example:

incromulentness (noun)

lack of sincerity or candor

"incromulentness in the manner of speech"

For a live demo, see https://www.thisworddoesnotexist.com

For a Twitter bot demo, see https://twitter.com/robo_define

Generating Words / Running Inference

Python dependencies are listed in https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/cpu_deploy_environment.yml

Pre-trained model files:

To use them:

from title_maker_pro.word_generator import WordGenerator
word_generator = WordGenerator(
  device="cpu",
  forward_model_path="<somepath1>",
  inverse_model_path="<somepath2>",
  blacklist_path="<blacklist>",
  quantize=False,
)

# a word from scratch:
print(word_generator.generate_word())

# definition for a word you make up
print(word_generator.generate_definition("glooberyblipboop")) 

# new word made up from a definition
print(word_generator.generate_word_from_definition("a word that does not exist")) 
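
If you want several distinct words rather than one, a small wrapper around any of the calls above may help. The following is a minimal sketch, not part of this repository: `collect_unique` is a hypothetical helper, and it assumes only that the generator is a zero-argument callable whose results can be compared via `str()`. Skipping `None` results is also an assumption, in case a call yields nothing. A stand-in generator is used so the sketch runs without model files.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def collect_unique(
    generate: Callable[[], Optional[T]],
    count: int,
    max_attempts: int = 100,
) -> List[T]:
    """Call `generate` until `count` distinct results are collected.

    Hypothetical helper: results are de-duplicated by their str() form,
    and at most `max_attempts` calls are made.
    """
    seen = set()
    results: List[T] = []
    for _ in range(max_attempts):
        if len(results) >= count:
            break
        item = generate()
        if item is None:  # assumption: a call may produce nothing
            continue
        key = str(item)
        if key not in seen:
            seen.add(key)
            results.append(item)
    return results


# Stand-in generator so the sketch runs without any model files.
_fake_words = iter(["glorp", "glorp", "snizzle", "brumbus"])
demo = collect_unique(lambda: next(_fake_words), count=3)
print(demo)
```

With a loaded generator, the same helper could be called as `collect_unique(word_generator.generate_word, count=5)`.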

Training a model

For raw thoughts, take a look at some of the notebooks in https://github.com/turtlesoupy/this-word-does-not-exist/tree/master/notebooks

To train, you'll need to find a dictionary -- there is code to extract from

After extracting a dictionary, you can use the main training script: https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/title_maker_pro/train.py. For a sample of a recent run, see https://github.com/turtlesoupy/this-word-does-not-exist/blob/master/scripts/sample_run_parsed_dictionary.sh

Website Development Instructions

cd ./website
pip install -r requirements.txt
pip install aiohttp-devtools 
adev runserver