kenhenry opened this issue 7 years ago
same issue here
Almost the same. I'm just getting "_UNK" :/
Same, any updates?
Could someone solve this issue?
After 10,200 steps... pretty disappointing:
Reading model parameters from working_dir/seq2seq.ckpt-10200
> hello
_UNK _UNK _UNK _UNK _UNK _UNK
> how are you
_UNK _UNK _UNK _UNK _UNK _UNK _UNK
> who are you?
_UNK _UNK _UNK _UNK _UNK _UNK _UNK
Ok, so first of all: the "size _UNK" issue was resolved for me after checking the training data, which was only a link. I had to get the data from the extra GitHub repo (check the YouTube video comments for the link) and process the dialogs with the scripts that came with it.
Second of all, after reading through the video comments, I also noticed people mentioning that the model was too complex, so in the seq2seq.ini file I reduced the layers to 1 and lowered the layer size from 256 to 128 (see the snippet below). After multiple days of training, at approximately 190,000 steps, still only gibberish: "cheese where where where where why after after after", or something like that. Some people mentioned that it was a bug in TensorFlow 0.12.0 that made it read the model incorrectly, causing it to train incorrectly. Has anyone tried with 0.10.0?
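For reference, this is roughly the change in seq2seq.ini (key names as I remember them from the repo's config; the values are just what I tried, not a recommendation):

```ini
; shrink the model so it can converge on a small corpus
num_layers = 1
layer_size = 128
```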
You should also check the Python version you're using. In my case, all the regular-expression checks failed because Python 3 treats strings and bytes as different types.
I will push my working code for Python 3 and TensorFlow 0.12 to my fork soon.
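In the meantime, here is a minimal sketch of the failure mode and the fix (the bytes regex mirrors the TF translate tutorial's data_utils that this repo adapts; the normalization helper is my own, so treat it as illustrative rather than the repo's exact code):

```python
import re

# Bytes pattern, as in the tutorial's data_utils. Under Python 3, applying it
# to a str raises: TypeError: cannot use a bytes pattern on a string-like object
_WORD_SPLIT = re.compile(b"([.,!?\"':;)(])")

def basic_tokenizer(sentence):
    # Normalize str input to bytes so the bytes pattern always matches.
    if isinstance(sentence, str):
        sentence = sentence.encode("utf-8")
    words = []
    for fragment in sentence.strip().split():
        words.extend(_WORD_SPLIT.split(fragment))
    return [w for w in words if w]

print(basic_tokenizer("hello, world!"))  # [b'hello', b',', b'world', b'!']
```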
This problem has already been addressed in this closed issue: https://github.com/llSourcell/tensorflow_chatbot/issues/9
Try the steps listed and hopefully it will resolve your errors.
Good luck!
@Seiwert Hello, I followed the instructions from that issue, but I'm still getting _UNK inside the answers after 120,000 training steps:
> what to do?
I can ' t go to the _UNK .
What may be the reason? Thanks
I have the same issue; does anyone have a solution?
@HazeHub I'm using Python 3.5.2 and TensorFlow 0.12.1. I get the same bytes-pattern-on-string error. The code in data_utils only does that if there is no vocabulary in working_dir, so if you're OK with dropping in another vocab, you can bypass the error. Just a workaround, to be sure, but I think it's OK for testing since the vocab isn't part of the enc/dec index.
EDIT: I used the data folder from @Seiwert. It's weird that even after following the same instructions my files were different, and I originally got a perplexity of around 1 using the stock 3 layers x 256. I swapped out the data files and reduced my ini values to 1 x 64, and now my perplexities are "much better", I think: step 300 = 384, step 600 = 108, step 900 = 70. I'll let this run for about 30 minutes and then reply back with what this chatbot can do.
EDIT2: "But I think it's OK for testing since the vocab isn't part of the enc/dec index." This is pure speculation, since I was experimenting with other people's checkpoints while keeping the same data and getting weird results from the chatbot. I went from every response being _UNK _UNK _UNK _UNK _UNK _UNK _UNK to funny responses:
> hello
outfit outfit outfit outfit outfit outfit outfit outfit outfit outfit
> excuse me?
embarrassing outfit outfit outfit outfit outfit outfit outfit outfit outfit
> what?
embarrassing outfit outfit outfit outfit outfit outfit outfit outfit outfit
> what are you saying
yelled yelled outfit outfit outfit outfit outfit outfit outfit outfit
> why are you stupid
yelled yelled outfit outfit outfit outfit outfit outfit outfit outfit
> can you understand me
yelled yelled outfit outfit outfit outfit outfit outfit outfit outfit
> no
outfit outfit outfit outfit outfit outfit outfit outfit outfit outfit
> are you ok
embarrassing outfit outfit outfit outfit outfit outfit outfit outfit outfit
If you look at http://suriyadeepan.github.io/2016-06-28-easy-seq2seq/ you can see that the responses are padded at the end. I think that using someone else's checkpoint with different data files creates this mismatched index, and the padding gets screwed up. So I'm now running @Seiwert's data files with @Seiwert's vocab files to create my own checkpoint.
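To see why mismatched files matter, here's a toy illustration (the vocab lists are made up; the checkpoint only stores weights per token id, so the same ids decode to different words under a different vocab file):

```python
# Two hypothetical vocab files: same special tokens, different word lists.
vocab_a = ["_PAD", "_GO", "_EOS", "_UNK", "hello", "there"]
vocab_b = ["_PAD", "_GO", "_EOS", "_UNK", "outfit", "yelled"]

# Ids a model trained against vocab_a might emit for a greeting.
output_ids = [4, 5, 0, 0]

print(" ".join(vocab_a[i] for i in output_ids))  # hello there _PAD _PAD
print(" ".join(vocab_b[i] for i in output_ids))  # outfit yelled _PAD _PAD
```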
EDIT3: After roughly 2 hours I'm at checkpoint 16800.
global step 16800 learning rate 0.4950 step-time 0.26 perplexity 17.71
  eval: bucket 0 perplexity 1471.44
  eval: bucket 1 perplexity 1514.87
  eval: bucket 2 perplexity 1375.03
  eval: bucket 3 perplexity 1218.81
Mode=test
Reading model parameters from working_dir/seq2seq.ckpt-16800
> hello
_UNK .
> still nothing
_UNK .
> can you speak
_UNK .
> no
_UNK .
> bye
_UNK .
> unknown
_UNK .
So it didn't work, but I now have a " ." at the end of every response/decode. Something is wrong, I think, because I'm not even getting garbage; I don't know why yet. However, I will keep going until I get to what some of the others have been citing as benchmarks: 8 hrs on an AMD FX-9590 for global step 45,600 at perplexity 10.62; ? hrs on the same for global step 95,00 at perplexity 1.
I will note that I'm using 1x64 where others are using 1x128.
@llSourcell Haha, I just noticed this in seq2seq.ini, which could be why the code isn't working for a lot of people:
```
# Mode : train, test, serve
mode = train
train_enc = data/train.enc
train_dec = data/train.dec
test_enc = data/test.enc
test_dec = data/test.enc
```
test decode is trying to use test encode?
@hoomanNo5 Hi, I changed it to "test_dec = data/test.dec" but it still gives "_UNK _UNK" for me. Any help appreciated:
> hello
_UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK
> where?
_UNK _UNK _UNK _UNK _UNK _UNK _UNK
> hii
_UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK
> hi
_UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK
> how are you?
_UNK _UNK _UNK
> do you speak?
_UNK _UNK _UNK
> no?
_UNK _UNK _UNK _UNK _UNK _UNK _UNK
> Where?
_UNK _UNK _UNK _UNK _UNK _UNK _UNK
@yashkumar6640 Are your train.enc and train.dec organized in conversation pairs? On each respective line, enc should be person A's utterance and dec should be person B's response; see the made-up example below. I used the prepare_data.py script that someone cited earlier.
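Something like this, where line i of one file is answered by line i of the other (the sentences are invented, just to show the alignment):

```
# train.enc -- person A, one utterance per line
hi , how are you ?
where are you going ?

# train.dec -- person B's reply on the matching line
i ' m fine , thanks .
to the station .
```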
I didn't take detailed notes on what didn't work, but from what I can remember, I had to fix a lot of bytes-pattern-vs-string object-type errors.
What are your perplexity scores? I'm currently training this bot with a different corpus (corpora?), and my global perplexity started at:

global step 300 learning rate 0.5000 step-time 0.47 perplexity 1350.04
  eval: bucket 0 perplexity 848.75
  eval: bucket 1 perplexity 836.90
  eval: bucket 2 perplexity 1076.74
  eval: bucket 3 perplexity 1305.57

and now it is:

global step 6600 learning rate 0.5000 step-time 0.41 perplexity 113.00
  eval: bucket 0 perplexity 1083.51
  eval: bucket 1 perplexity 2119.63
  eval: bucket 2 perplexity 2467.27
  eval: bucket 3 perplexity 2210.90
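(For reference, the perplexity printed there is just the exponential of the average per-token cross-entropy loss; a minimal sketch, with the overflow guard as in the TF translate tutorial this code adapts:)

```python
import math

# Perplexity = exp(average per-token cross-entropy loss).
def perplexity(avg_loss):
    return math.exp(avg_loss) if avg_loss < 300 else float("inf")

print(perplexity(7.2))  # ~1339 -- roughly where my run started
print(perplexity(0.0))  # 1.0 -- a perfectly confident (or memorized) model
```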
I know global perplexity needs to get down to around 1 to be somewhat readable. This was at 2x256 with a 20,000-word vocab:
> hello
Mr . Gittes .
> It's nice to talk to you.
You want to talk ?
> Yes, I do. It's been a while.
Some way is it ' s report .
> Well, I wouldn't put it that way.
You have to go leave this , sir ?
> I don't, I can hang out for a while.
For whom ?
And this was at 3x512 with a 100,000-word vocab:
> hello
What ?
> i'm saying hello
YOUNG gives you at basketball , fella giving me barf .
> that doesn't make much sense
It ' s a pain . It never doesn ' t .
> you sound better than last time
Alright , wasn ' t that what you want me .
> i know, but you still need some work
Yeah that ' s .
> yeah thats true
A cup of life ?
Not that great. I probably need a better tokenizer, but I'm just a newbie here, so I'm experimenting.
@hoomanNo5 Hi, I'm currently training it with 1 layer of size 128, and my perplexity scores look like this:

global step 33600 learning rate 0.3699 step-time 1.88 perplexity 1.35
  eval: bucket 0 perplexity 1.49
  eval: bucket 1 perplexity 1.39
  eval: bucket 2 perplexity 1.34
  eval: bucket 3 perplexity 1.26
global step 33900 learning rate 0.3699 step-time 1.85 perplexity 1.35
  eval: bucket 0 perplexity 1.51
  eval: bucket 1 perplexity 1.38
  eval: bucket 2 perplexity 1.30
  eval: bucket 3 perplexity 1.22
global step 34200 learning rate 0.3699 step-time 1.95 perplexity 1.35
  eval: bucket 0 perplexity 1.48
  eval: bucket 1 perplexity 1.37
  eval: bucket 2 perplexity 1.31
  eval: bucket 3 perplexity 1.23
global step 34500 learning rate 0.3699 step-time 2.01 perplexity 1.35
  eval: bucket 0 perplexity 1.48
  eval: bucket 1 perplexity 1.39
  eval: bucket 2 perplexity 1.31
  eval: bucket 3 perplexity 1.23
global step 34800 learning rate 0.3699 step-time 1.59 perplexity 1.36
  eval: bucket 0 perplexity 1.48
  eval: bucket 1 perplexity 1.38
  eval: bucket 2 perplexity 1.31
  eval: bucket 3 perplexity 1.22
global step 35100 learning rate 0.3662 step-time 2.03 perplexity 1.35
  eval: bucket 0 perplexity 1.46
  eval: bucket 1 perplexity 1.38
  eval: bucket 2 perplexity 1.29
  eval: bucket 3 perplexity 1.22
When I stopped it and ran it, I got this result:
> hi
Moore trifle giving giving giving giving giving trifle trifle trifle
> how are you?
Hendricks Hendricks scrawl teach teach teach teach teach throbbing throbbing
> please give me some good result
Hendricks Hendricks Somerset Somerset Somerset throbbing throbbing throbbing throbbing throbbing throbbing throbbing throbbing throbbing throbbing
> You are just reading movie character names i guess
Hendricks wax wax teach teach teach teach teach teach teach soaked soaked soaked soaked soaked
> you are funny
Hendricks scrawl scrawl scrawl throbbing throbbing throbbing throbbing throbbing throbbing
And yes, train.enc and train.dec are organized in conversation pairs. Please see if you can help me with this; thanks. I did get rid of that _UNK thing.
I read somewhere above in the comments that it may be because I downloaded the dataset from somewhere else but took the vocab file from here. Do you know anything about whether that's a problem?
First of all, lmao at your chatbot's responses. If you read this post: http://suriyadeepan.github.io/2016-06-28-easy-seq2seq/ you can see that the chatbot is designed to bucketize the enc/dec strings by size and pad the leftovers. Somehow "throbbing" got switched with your pad token, I think.
As a test, I would re-run exactly the same training, but find "throbbing" in your corpus and replace it with a test word. If the output changes, you know the problem lies in how your chatbot defines its pad word and that your corpus is messed up. (A rough sketch of the padding logic is below.)
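For context, a minimal sketch of the bucketing + padding step described in that post (bucket sizes and special-token ids follow the TF translate tutorial this repo adapts; it's illustrative, not the repo's exact code):

```python
PAD_ID, GO_ID = 0, 1  # if your vocab maps a real word to id 0, that word becomes the pad
BUCKETS = [(5, 10), (10, 15), (20, 25), (40, 50)]  # (encoder length, decoder length)

def bucket_and_pad(enc_ids, dec_ids):
    """Pick the smallest bucket that fits, then pad both sides to its size."""
    for enc_size, dec_size in BUCKETS:
        if len(enc_ids) <= enc_size and len(dec_ids) < dec_size:
            # Encoder: append pads, then reverse, so the padding ends up in front.
            encoder_input = list(reversed(enc_ids + [PAD_ID] * (enc_size - len(enc_ids))))
            # Decoder: GO symbol first, then the reply, then trailing pads.
            decoder_input = [GO_ID] + dec_ids + [PAD_ID] * (dec_size - len(dec_ids) - 1)
            return encoder_input, decoder_input
    raise ValueError("sentence too long for every bucket")

# The leftover slots are filled with whatever word id 0 decodes to:
print(bucket_and_pad([7, 8], [9]))  # ([0, 0, 0, 8, 7], [1, 9, 0, 0, 0, 0, 0, 0, 0, 0])
```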
On a side note, I haven't got this chatbot to say anything actually intelligent, so I'm probably going to move on to another architecture, probably one where we aren't re-purposing a translator to maintain the illusion of a conversation. But heck, I don't know much about this, so I could be absolutely wrong.
I used the default settings to run the program. After 23,000 steps, when I input something I only get "size _UNK". How can I get a nice reply from the chatbot?