Closed by kadir-gunel 7 years ago
Hi,
You could halve the learning rate when you find that the learning curve stops decreasing. You could use plot_curve.py to plot the curve, for example after 30000 iterations. You could also try AdaDelta. You do not need to set the learning rate when using AdaDelta, but AdaDelta is much slower than Adam.
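The halving rule above could be sketched roughly as follows. This is only an illustration, not code from the repository: the function name halve_on_plateau, the patience window, and the sample cost values are all assumptions made for the example.

```python
def halve_on_plateau(costs, lr, patience=3, min_delta=0.0):
    """Return lr / 2 if the last `patience` costs did not improve on the
    best earlier cost by more than `min_delta`; otherwise return lr.

    `costs` is a list of recorded costs, oldest first. All names and
    defaults here are hypothetical, chosen only for illustration.
    """
    if len(costs) <= patience:
        # Not enough history yet to judge whether the curve has flattened.
        return lr
    best_before = min(costs[:-patience])
    recent_best = min(costs[-patience:])
    if recent_best >= best_before - min_delta:
        # The curve stopped decreasing: halve the learning rate.
        return lr / 2.0
    return lr


# Made-up cost history whose last three values no longer improve.
history = [5.0, 4.0, 3.5, 3.5, 3.6, 3.5]
new_lr = halve_on_plateau(history, 1e-3)
```

In practice the costs would come from whatever the training log records, and the patience window would need tuning by looking at the plotted curve.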
The char embeddings can be set in config['enc_embed'] and config['dec_embed'].
The word embedding is config['src_dgru_nhids'], which is 512 in my definition.
For simplicity, we don't distinguish between the test set and the dev set in the config file; you could set config['test_set'] accordingly.
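As a rough illustration of the settings named above, the entries might look like this. The key names are the ones mentioned in this thread; the value 64 is the char embedding size quoted from the paper in the question, and the file path is a hypothetical placeholder, not a path from the repository.

```python
# Sketch of the relevant configuration entries (values are illustrative).
config = {}

# Character embedding sizes for the encoder and decoder.
config['enc_embed'] = 64
config['dec_embed'] = 64

# Word-level representation size; 512 is the value stated in this thread.
config['src_dgru_nhids'] = 512

# Point this at whichever set should be evaluated (dev or test).
# The path below is a hypothetical placeholder.
config['test_set'] = 'data/dev.txt'
```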
Your questions are really useful. If you encounter other problems when training, please feel free to raise questions.
Thanks
Thank you.
I noticed something about the GPU memory usage. In the configuration file, I set batch_size first to 56 then to 100 and GPU memory usage did not even change; and also src|trg_seq_char_len are set to 450. Am I skipping something?
After changing batch_size, you should delete dcnmt_*2*/log and dcnmt_*2*/iterations_state.pkl, because otherwise training will continue from the checkpoint, which uses the previous batch_size (56).
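A small helper along these lines could remove the stale state before restarting. It is a sketch under assumptions: it treats dcnmt_*2* as a glob pattern relative to a base directory, and the function name clear_training_state is made up for this example. It leaves params.npz and everything else untouched.

```python
import glob
import os
import shutil


def clear_training_state(base_dir="."):
    """Delete the log and iteration state under dcnmt_*2*/ so that
    training does not resume from a checkpoint with the old batch_size.

    Returns the sorted list of removed paths. Treating dcnmt_*2* as a
    glob pattern is an assumption of this sketch.
    """
    patterns = ["dcnmt_*2*/log", "dcnmt_*2*/iterations_state.pkl"]
    removed = []
    for pattern in patterns:
        for path in glob.glob(os.path.join(base_dir, pattern)):
            if os.path.isdir(path):
                shutil.rmtree(path)  # in case the log is a directory
            else:
                os.remove(path)
            removed.append(path)
    return sorted(removed)
```

Calling clear_training_state() from the directory that contains the dcnmt_*2* folder would then report which files were deleted.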
Thank you for the fast response. I deleted those files, but memory usage still barely reaches 2.5GB.
How large is the params.npz file? Do you use allow_gc = False?
I think 2.5GB is too small; training usually takes more than 8GB of memory.
These are the Theano flags that I am using: THEANO_FLAGS="on_unused_input=ignore, device=gpu, floatX=float32"
I think it is better to try
THEANO_FLAGS="on_unused_input=ignore, device=gpu, floatX=float32, allow_gc = False"
It will speed up training but consume more memory. Alternatively, you could try cnmem.
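If the flags are easier to manage from Python than from the shell, something like the following could work. It is a sketch only: build_theano_flags is a hypothetical helper, and it relies on Theano reading the THEANO_FLAGS environment variable when it is first imported, so the variable has to be set before that import.

```python
import os


def build_theano_flags(options):
    """Join (name, value) pairs into a comma-separated THEANO_FLAGS string.

    `options` is a list of tuples so that the order is preserved.
    This helper is hypothetical and not part of Theano.
    """
    return ",".join("{}={}".format(name, value) for name, value in options)


flags = build_theano_flags([
    ("on_unused_input", "ignore"),
    ("device", "gpu"),
    ("floatX", "float32"),
    ("allow_gc", "False"),  # faster training, but higher GPU memory use
])

# Must be set before `import theano` for the flags to take effect.
os.environ["THEANO_FLAGS"] = flags
```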
Yep! :+1: Now, it seems normal. Nearly 8GB of memory.
Thanks.
You're welcome. Just be careful, because the GPU may run out of memory.
Hello @SwordYork ,
I find it useful to create new threads for unrelated questions, hence I created a new one.
In the paper, the learning rate is changed from 1e-3 to 1e-4, but it is not mentioned at which iteration this happens or what the criterion for changing it should be. Could you share this information with me?
Also, the paper says that char embeddings are set to 64 and word embeddings are set to 600. But in the configuration file there is an entry only for word embeddings, and it is set to 64. How can I change the char embeddings?
And lastly, in the config file, you meant the dev set when you wrote test set, right?
By the way, forgive me for the question bombardment :smile: Thank you in advance :+1:
Best regards, Kadir