**I've mostly stopped working on this project in favor of my newer implementation, neural-chatbot, built with TensorFlow.**
An implementation of Google's seq2seq architecture.
I am also blogging about the process.
First install torch if you haven't already. Here is an easy install process.
You will also need to install a few packages to get this to work:
$ luarocks install nn
$ luarocks install rnn
Optional (for GPU usage):
$ luarocks install cunn
$ luarocks install cutorch
Or, if you prefer AMD (OpenCL):
$ luarocks install clnn
$ luarocks install cltorch
If you get errors, you should try installing these packages as well:
$ luarocks install dpnn
$ luarocks install cunnx
You will need one or more large corpus text files in which each line is a conversational phrase. For any two adjacent lines, the preceding line is assumed to be the source and the following line the target.
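As a rough illustration of that corpus format, here is a minimal Python sketch of how adjacent lines could be turned into (source, target) pairs. It is not the project's actual loader (which is written in Lua); the function name `line_pairs` is hypothetical, and two behaviors are assumptions on my part: every line serves as the target of the line before it and the source of the line after it, and blank lines are skipped.

```python
from typing import Iterable, Iterator, Tuple


def line_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) pairs from consecutive corpus lines.

    Hypothetical helper, for illustration only.
    """
    previous = None
    for raw in lines:
        phrase = raw.strip()
        if not phrase:
            # Assumption: blank lines carry no phrase and are skipped.
            continue
        if previous is not None:
            # The preceding line is the source, the current line the target.
            yield previous, phrase
        previous = phrase


# A tiny in-memory stand-in for a corpus file (one phrase per line).
corpus = ["Hi there.\n", "Hello!\n", "How are you?\n"]
pairs = list(line_pairs(corpus))
for source, target in pairs:
    print(f"{source!r} -> {target!r}")
```

To try this on a real file, pass an open file object instead of the list, since iterating over a file yields its lines.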
To simplify things, my plan is either to include a bash script that downloads a decent-sized, pre-cleaned corpus, or to include the corpus itself in the data directory. I will do this in the near future, probably after I finish the TODO list above.
Run
$ th train.lua
The dataset is stored in data/raw/ and comes from my other project, opensubtitles-parser.