chenb67 closed this 8 years ago
Awesome work! Will give this a try soon.
Did you test it out w/ a few sample conversations? Did it improve or degrade results (objectively)?
I couldn't reproduce the results in the readme with any of the models I've tested so far (512 hidden units). However, the version with Adam seems to converge faster than the SGD version, and we can now use any optimizer in a plug-and-play fashion. Conversation-wise, I get quite similar results.
Damnit! I let it run over the weekend but forgot to pull your latest changes ...
Good news is it appears to be running a bit faster: 1h30m w/ --dataset 50000 --hiddenSize 1000, 15min less than before on my machine.
Will update w/ results as soon as I have some.
Got some results after 50 epochs w/: $ th train.lua --cuda --dataset 30000 --hiddenSize 1000
Errors: min= 0.12092084043205
max= 12.942585066745
median= 1.2809126438544
mean= 1.3072453725232
std= 0.34914281377921
ppl= 3.6959786325385
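As a sanity check, the reported ppl is consistent with the errors above being mean negative log-likelihoods in nats: exponentiating the mean reproduces it. A minimal Python check (assuming that interpretation of the stats):

```python
import math

# Assumption: the "mean" error above is the average negative
# log-likelihood per token, in nats. Perplexity is then exp(mean NLL).
mean_nll = 1.3072453725232  # "mean" from the stats above

ppl = math.exp(mean_nll)
print(ppl)  # matches the reported ppl of ~3.69598
```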
th> say "Hi"
>> Alone tonight?
[0.2976s]
th> say "What is your name?"
>> Victor frankenstein, sir.
[0.2085s]
th> say "How old are you?"
>> Twenty six years.
[0.1879s]
th> say "What color is the sky?"
>> It's a problem.
[0.2599s]
th> say "Are you human?"
>> I'm fine.
[0.1666s]
th> say "Are you a machine?"
>> I've had worse.
[0.2127s]
th> say "Where are you from?"
>> Helsinki,, which is the capital of finland.
[0.3404s]
th> say "How are you doing?"
>> Very well...
[0.1381s]
th> say "Do you like to cook?"
>> Sure.
[0.0989s]
th> say "What is the purpose of life?"
>> It's a game.
[0.2717s]
th> say "What is the purpose of living?"
>> Nothing not.
[0.1161s]
th> say "Hello"
>> You ready?
[0.1244s]
th> say "Hello?"
>> Hello, is this elaine?
[0.2425s]
th> say "No"
>> You were screaming you weren't nick!
[0.3195s]
th> say "No"
>> You were screaming you weren't nick!
[0.3653s]
th> say "I'm not Nick"
>> And don't blame me...
[0.2993s]
th> say "I won't"
>> He's not quite...
[0.2065s]
th> say "what?"
>> I'm trying to stop the
[0.2919s]
th> say "the what?"
>> The little brenner girl.
The training took more memory so I had to drop from 50k to 30k examples. I think this impacted the quality of some of the answers above (e.g. "What color is the sky?"). But other answers were pretty good!
However, I had to remove the "conversion to CPU before saving" because it took too much memory. If you make this an option, I'll merge.
Correct PPL calculation is so important:) Thank you!
Btw, that is really the coolest name I have ever heard; the machine is claiming he's a little monster :) Haha!
But somehow I am not impressed with these answers:
th> say "What is the purpose of life?"
>> It's a game.
th> say "What is the purpose of living?"
>> Nothing not.
I hope he can do better...
Can you help push these changes to the main branch? Thank you!
@macournoyer Yes, there is an issue in this version: we keep a reference to the params, which caused memory to grow after the first epoch. I fixed it in my dev branch some time ago. Do you want me to just fix it, or add the option anyway?
Chen
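The memory growth described above is the classic pattern where a retained strong reference keeps a large buffer alive past the point it's needed; a minimal Python sketch of the mechanism (class and variable names are hypothetical, not from the project):

```python
import gc
import weakref

class BigBuffer:
    """Stand-in for a large parameter tensor."""
    pass

buf = BigBuffer()
ref = weakref.ref(buf)  # weak reference: does NOT keep buf alive
leaked = buf            # stray strong reference: DOES keep buf alive

del buf
gc.collect()
assert ref() is not None  # still alive, `leaked` is holding it

del leaked
gc.collect()
assert ref() is None      # now the buffer can actually be freed
```

The analogous fix in the Torch code is to drop the stray reference to the params so the old buffers can be garbage-collected between epochs.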
@chenb67 ah! If you have a fix, that'd be way better than an option :)
Thanks again for the amazing work @chenb67 !
Thanks for the cool project! @macournoyer I have more features in my dev branch. Would you rather have small PRs for each feature, or another pretty big PR?
@chenb67 a big PR like this one is fine with me. Whatever is simpler.
Hi,
In this PR included:
I'm working next on a better dataset class, a multilayer model, and testing SeqLSTM instead of just LSTM.
Chen