vflux opened this issue 5 years ago
I also tried to pretrain the generator model, but even after 80,000 training steps the loss doesn't seem to decrease and the output is poor. I'd like to ask @iwangjian whether you managed to get appropriate results with this code, like those reported in the original paper.
Thanks!
I've fixed some bugs and updated the README, thanks for your comments!
I'm getting results like:

```
the the the . the the , the . . the . , the the
the the he the the of the the '' the the to the the
```
How do I achieve results like those in the paper? How many pretraining steps should I take?
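For what it's worth, one way to quantify how close the decoded summaries are to paper-level quality is to score them against the reference abstracts with ROUGE. A minimal sketch using the third-party `rouge` package; the file paths are placeholders, not paths from this repo:

```python
# pip install rouge -- a lightweight third-party ROUGE implementation
from rouge import Rouge

# Placeholder files: one decoded summary / one reference abstract per line.
with open("decoded.txt") as f:
    hypotheses = [line.strip() for line in f]
with open("reference.txt") as f:
    references = [line.strip() for line in f]

rouge = Rouge()
# avg=True returns mean ROUGE-1/2/L precision, recall, and F1 over the set.
scores = rouge.get_scores(hypotheses, references, avg=True)
print(scores["rouge-1"], scores["rouge-2"], scores["rouge-l"])
```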
You can directly use the provided discriminator_train_data.npz to run pretraining and then train the full model. The full training may be pretty slow.
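As a quick sanity check before training, you can inspect what the provided .npz archive actually contains; a minimal sketch with NumPy that makes no assumptions about the key names and just prints whatever is stored:

```python
import numpy as np

# Open the provided discriminator pretraining archive (arrays are lazy-loaded).
data = np.load("./data/discriminator_train_data.npz")

# Print every stored array's name, shape, and dtype.
for key in data.files:
    print(key, data[key].shape, data[key].dtype)
```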
Has anyone tried reducing the number of iterations, e.g. to 80,000? Would that give better results?
@iwangjian Hey, I am using the discriminator_train_data.npz file and have trained for around 650 iterations, but my output still looks like this:
```
b'.'
b'new new : .'
b': .'
b'new : manchester .'
b': : .'
b'.'
b': v .'
b'.'
b'.'
b'new .'
b': may .'
b': manchester manchester .'
b'new'
```
Commands that I am running:

```
python3 main.py --mode=train --data_path=./data/train.bin --vocab_path=./data/vocab --log_root=./log --pretrain_dis_data_path=./data/discriminator_train_data.npz --restore_best_model=False
```

For testing:

```
python3 main.py --mode=decode --data_path=./chunked/test_001.bin --vocab_path=./data/vocab --log_root=./log --single_pass=True
```
The chunked data was generated using the same method you describe in your repo.
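If you want to verify the chunked files are well-formed, you can decode a few records directly. A sketch assuming the data follows the pointer-generator binary format (an 8-byte length prefix followed by a serialized tf.Example with 'article' and 'abstract' features), which this repo appears to inherit:

```python
import struct
from tensorflow.core.example import example_pb2

# Read the first few examples from a chunked .bin file.
with open("./chunked/test_001.bin", "rb") as reader:
    for _ in range(3):
        len_bytes = reader.read(8)
        if not len_bytes:
            break  # end of file
        str_len = struct.unpack("q", len_bytes)[0]
        example = example_pb2.Example.FromString(reader.read(str_len))
        article = example.features.feature["article"].bytes_list.value[0]
        abstract = example.features.feature["abstract"].bytes_list.value[0]
        print(article[:80], "->", abstract[:80])
```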
Thanks in advance.
Never mind, I was making a mistake. When using discriminator_train_data.npz, the train command should be:

```
python3 main.py --mode=pretrain --data_path=./data/train.bin --vocab_path=./data/vocab --log_root=./log --pretrain_dis_data_path=./data/discriminator_train_data.npz --restore_best_model=False
```
@iwangjian How did you solve this problem? I used discriminator_train_data.npz and ran the full training, but my decode results look the same and are quite poor: `the the the . the the , the . . the . , the the`. Thank you for your reply.
I'm trying to run the code on the CNN/DailyMail dataset following the instructions in the README, but the loss doesn't seem to be decreasing, and when I try to decode I get output like the following. This is after more than 1,000,000 training steps.
Do you have any idea where the problem could be?