Closed: wiso closed this issue 7 years ago
How many hidden_units/hidden_layers are you using? My implementation is actually pretty slow because, when it creates the layer, it just uses a loop to build the mask, so the number of iterations is roughly the number of parameters. In my notebook, I use 2 hidden layers, 8000 units, about 77M parameters. Training might also be slow simply because there are 77M parameters. I have a GTX 1070 and it took roughly 3 hours to train 600 epochs on MNIST.
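For illustration, here is a minimal sketch of how a mask like this could be built without a per-parameter Python loop, using NumPy broadcasting. The masking rule is an assumption (a MADE-style rule where a connection is kept when the output unit's degree is at least the input unit's degree), and the names `mask_loop`, `mask_vectorized`, `in_degrees` and `out_degrees` are hypothetical, not taken from the notebook.

```python
import numpy as np

def mask_loop(in_degrees, out_degrees):
    """Build the mask one entry at a time: one iteration per weight."""
    mask = np.zeros((len(in_degrees), len(out_degrees)), dtype=np.float32)
    for i, d_in in enumerate(in_degrees):
        for j, d_out in enumerate(out_degrees):
            # Assumed rule: keep the connection if d_out >= d_in.
            if d_out >= d_in:
                mask[i, j] = 1.0
    return mask

def mask_vectorized(in_degrees, out_degrees):
    """Build the same mask with a single broadcast comparison."""
    in_degrees = np.asarray(in_degrees)
    out_degrees = np.asarray(out_degrees)
    # Shapes (n_in, 1) and (1, n_out) broadcast to (n_in, n_out).
    return (out_degrees[None, :] >= in_degrees[:, None]).astype(np.float32)

# Small sanity check that both versions agree.
rng = np.random.RandomState(0)
in_deg = rng.randint(1, 10, size=50)
out_deg = rng.randint(1, 10, size=40)
assert np.array_equal(mask_loop(in_deg, out_deg),
                      mask_vectorized(in_deg, out_deg))
```

The loop version does one Python-level iteration per weight, which matches the "number of iterations is roughly the number of parameters" cost described above. The broadcast version does the same comparison in compiled NumPy code, so it should scale much better to layers with millions of weights.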
Sorry, I should have been more patient. It took 10 minutes in the end (CPU only; I don't know if that matters at this stage). I am using the same parameters as in the notebook, without any changes.
The MaskingDense function in the notebook example takes ages. Is it expected?

keras 2.1.1, tensorflow 1.4.1