jzilly / RecurrentHighwayNetworks

Recurrent Highway Networks - Implementations for Tensorflow, Torch7, Theano and Brainstorm
MIT License

Training does not work #12

Closed (Avmb closed this issue 7 years ago)

Avmb commented 7 years ago

I tried both the Theano and Tensorflow version and neither works:

    $ python theano_rhn_train.py with ptb_sota
    Traceback (most recent call last):
      File "theano_rhn_train.py", line 22, in <module>
        from data.reader import data_iterator, hutter_raw_data, ptb_raw_data
    ImportError: cannot import name hutter_raw_data

    $ python rhn_train.py with ptb_sota
    WARNING - rhn_prediction - No observers have been added to this run
    INFO - rhn_prediction - Running command 'main'
    INFO - rhn_prediction - Started
    Configuration (modified, added, typechanged, doc):
      batch_size = 20
      data_path = 'data'
      dataset = 'ptb'
      depth = 10                         # the recurrence depth
      drop_h = 0.25
      drop_i = 0.75
      drop_o = 0.75
      drop_x = 0.25
      hidden_size = 830
      init_bias = -2.0
      init_scale = 0.04
      learning_rate = 0.2
      load_model = ''
      lr_decay = 1.02
      max_epoch = 20
      max_grad_norm = 10
      max_max_epoch = 500
      mc_steps = 0
      num_layers = 1
      num_steps = 35
      seed = 227745048                   # the random seed for this experiment
      tied = True
      vocab_size = 10000
      weight_decay = 1e-07
    ERROR - rhn_prediction - Failed after 0:00:00!
    Traceback (most recent calls WITHOUT Sacred internals):
      File "rhn_train.py", line 324, in main
        reader, (train_data, valid_data, testdata, ) = get_data(data_path, dataset)
      File "rhn_train.py", line 147, in get_data
        from tensorflow.models.rnn.ptb import reader
    ImportError: No module named models.rnn.ptb
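
The second traceback comes from a hard-coded import of `tensorflow.models.rnn.ptb`, a module path that the installed Tensorflow does not provide. A minimal sketch of one way to make such an import tolerant of layout changes is to try several candidate module paths in order. The helper name `import_first_available` is a hypothetical illustration and not code from the repository.

```python
import importlib


def import_first_available(candidates):
    """Return the first module in `candidates` that can be imported.

    Hypothetical helper for illustration; not part of the repository.
    """
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append("{}: {}".format(name, exc))
    raise ImportError(
        "none of the candidates could be imported:\n  " + "\n  ".join(errors)
    )
```

For example, `reader = import_first_available(["tensorflow.models.rnn.ptb.reader", "data.reader"])` would prefer the Tensorflow module and fall back to a local one. Whether a local `data.reader` exposes the same functions is an assumption that would need checking.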

Avmb commented 7 years ago

enwiki8_sota and text8_sota also don't work:

TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

jzilly commented 7 years ago

Hi Antonio, Thanks for trying the code. Let me see whether I can help you get it to work. Could you tell us which Theano and Tensorflow versions you are using? What would I need to do to reproduce the error from a clean git clone? That would make it easier to help.

For your reference, we also have a branch that is compatible with Tensorflow 1.0. The newest version of Tensorflow might have compatibility issues that have not been checked yet. The Theano issue might be solved by correcting the names of the imports, which were changed recently.

Avmb commented 7 years ago

Hi Julian, I used Theano 0.9, but I don't think it's a version issue, since the theano_rhn_train.py script tries to import things from data/reader.py that just aren't there.
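
A quick way to confirm this kind of mismatch, independent of the library version, is to list which of the requested names a module actually lacks. The following is a minimal sketch; the function name `missing_names` is a hypothetical illustration, not something from the repository.

```python
import importlib


def missing_names(module_name, names):
    """Return the entries of `names` that `module_name` does not define.

    Hypothetical helper for illustration; not part of the repository.
    """
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]
```

Run from the repository root, `missing_names("data.reader", ["data_iterator", "hutter_raw_data", "ptb_raw_data"])` would report the names that the training script expects but the reader module does not provide.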

For Tensorflow I used version 1.2.0 (the latest one on pip). I've tried both branches of your code, but it doesn't work (it tries to import things from the Tensorflow contrib module that aren't there).

I'd prefer to work with Theano. Is it possible to fix the problem?

jzilly commented 7 years ago

Hi Antonio, Shimi Salant, who made the Theano implementation possible in the first place, was so kind as to fix my code problem and also adjusted the hyperparameters in the code to fit our newest results. I merged his pull request, so the Theano code is now up to date on the master branch. It runs on my computer. Please let us know whether it also fixes your problem. Best, Julian

Avmb commented 7 years ago

I'm running it now and it seems to be working. Thanks a lot!