mikeizbicki / cmc-csci181-deeplearning

deep learning course materials

Hyperparameters for part 3 #13

Open AlexKer opened 4 years ago

AlexKer commented 4 years ago

It would be great if you could provide the full list of hyperparameters used for the GRU in the part 3 video tutorial. I'm trying to replicate the TensorBoard results so that I can generate comparable strings.

mikeizbicki commented 4 years ago

My initial training is done with

$ python3 names.py --train --model=gru --learning_rate=1e-1 --batch_size=10 --gradient_clipping --hidden_layer_size=128 --num_layers=1 --num_samples=100000

and then I decay the learning rate by a factor of 10 twice (from 1e-1 to 1e-2, then to 1e-3).
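
For anyone replicating this, here is a minimal sketch of the learning rates that schedule implies, reading "decay by 10" as dividing by 10. The helper `decay_schedule` is hypothetical and not part of `names.py`. The thread does not say when each decay happens or how training is resumed between stages, so only the rate values are shown.

```python
def decay_schedule(initial_lr, factor=10.0, num_decays=2):
    """Return the learning rate for each training stage.

    Hypothetical helper (not from names.py): starts at `initial_lr`
    and divides by `factor` once per decay.
    """
    rates = [initial_lr]
    for _ in range(num_decays):
        rates.append(rates[-1] / factor)
    return rates


# Initial rate taken from the command above (--learning_rate=1e-1).
rates = decay_schedule(1e-1)
for stage, lr in enumerate(rates):
    print(f"stage {stage}: learning_rate={lr:.0e}")
```

Each stage would keep the other flags from the command above and change only `--learning_rate`.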