luheng / deep_srl

Code and pre-trained model for: Deep Semantic Role Labeling: What Works and What's Next
Apache License 2.0

Cannot run deep_srl #4

Closed gkiril closed 6 years ago

gkiril commented 6 years ago

I was following the instructions in the README file. However, when I try to run the interactive console with python python/interactive.py --model conll05_model/ --pidmodel conll05_propid_model, I get the following error:

Embedding size=100
Using 1 feature types, projected output dim=100.
('lstm_0_rdrop', 0.1, True)
Traceback (most recent call last):
  File "python/interactive.py", line 61, in <module>
    pid_model, pid_data = load_model(args.pidmodel, 'propid')
  File "python/interactive.py", line 42, in load_model
    model = BiLSTMTaggerModel(data, config=config, fast_predict=True)
  File "/home/gkiril/Documents/workspace/deep_srl/deep_srl/python/neural_srl/theano/tagger.py", line 41, in __init__
    prefix='lstm_{}'.format(l))
  File "/home/gkiril/Documents/workspace/deep_srl/deep_srl/python/neural_srl/theano/layer.py", line 197, in __init__
    self._init_dropout_layers(input_dropout_prob, recurrent_dropout_prob)
  File "/home/gkiril/Documents/workspace/deep_srl/deep_srl/python/neural_srl/theano/layer.py", line 141, in _init_dropout_layers
    prefix='{}_rdrop'.format(self.prefix))
  File "/home/gkiril/Documents/workspace/deep_srl/deep_srl/python/neural_srl/theano/layer.py", line 402, in __init__
    self.rng = MRG_RandomStreams(seed=RANDOM_SEED, use_cuda=True)
TypeError: __init__() got an unexpected keyword argument 'use_cuda'

This is probably some Theano issue (although I already installed it as suggested in your tutorial).

Any idea of how this can be fixed?

luheng commented 6 years ago

Thanks for reporting this. It could be a difference in Theano versions. It looks like they changed the API in version 1.0: http://deeplearning.net/software/theano/library/sandbox/rng_mrg.html

As a quick fix, could you please try removing the "use_cuda" argument? No guarantee for this though.

I'm a bit busy lately, but will look into it when I have more time.
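
For readers who want the quick fix to work across Theano versions instead of editing the call by hand, here is a minimal sketch. It assumes the only incompatibility is the `use_cuda` keyword shown in the traceback; the helper name `make_random_streams` is hypothetical and not part of deep_srl. It tries the original call first and retries without `use_cuda` if the constructor rejects it.

```python
def make_random_streams(stream_cls, seed):
    """Build a random-stream object, dropping `use_cuda` if it is rejected.

    `stream_cls` is whatever class the code would otherwise call directly
    (in layer.py this is MRG_RandomStreams). This helper is a hypothetical
    illustration, not code from the repository.
    """
    try:
        # Original call from layer.py, which raises TypeError when the
        # installed Theano no longer accepts the keyword.
        return stream_cls(seed=seed, use_cuda=True)
    except TypeError:
        # Fallback: same call without the unsupported keyword.
        return stream_cls(seed=seed)


# Possible use inside layer.py (assumes Theano is installed):
#   from theano.sandbox.rng_mrg import MRG_RandomStreams
#   self.rng = make_random_streams(MRG_RandomStreams, RANDOM_SEED)
```

If the fallback call fails too, the second `TypeError` propagates, so an unrelated argument problem is not silently hidden.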

gkiril commented 6 years ago

Thanks for your quick answer.

I tried removing the use_cuda argument. It seems to be working, though some warning is shown on the console:

deep_srl/python/neural_srl/theano/tagger.py:94: UserWarning: theano.function was asked to create a function computing outputs given certain inputs, but the provided input variable at index 2 is not part of the computational graph needed to compute the outputs: <TensorType(int8, scalar)>.
To make this warning into an error, you can pass the parameter on_unused_input='raise' to theano.function. To disable it completely, use on_unused_input='ignore'.
  givens=({self.is_train:  numpy.cast['int8'](0)}))

Suggestion: it could be helpful for other people if this note were added to the README.

Cheers!
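
The warning text above already names the in-code option: passing on_unused_input='ignore' to theano.function. If editing tagger.py is not desirable, one alternative is to filter that specific warning with Python's standard `warnings` module. The sketch below assumes the message keeps the wording shown in the log; the constant and function names are hypothetical.

```python
import re
import warnings

# Start of the UserWarning text quoted in the log above (assumed stable).
THEANO_UNUSED_INPUT_PREFIX = (
    "theano.function was asked to create a function computing outputs"
)


def silence_unused_input_warning():
    """Ignore only the unused-input UserWarning; other warnings still show."""
    warnings.filterwarnings(
        "ignore",
        # `message` is a regex matched against the start of the warning text,
        # so the literal prefix is escaped.
        message=re.escape(THEANO_UNUSED_INPUT_PREFIX),
        category=UserWarning,
    )


# Call once before the model is built, e.g. near the top of interactive.py:
#   silence_unused_input_warning()
```

This only hides the message. The unused `is_train` input remains in the compiled function, which is harmless here since the warning is informational.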

rakesh-malviya commented 6 years ago

Hi Luheng,

Approximately how long did it take to train conll05_model and conll05_propid_model? What hardware did you use for training? Or what hardware would be ideal for training deep_srl?

Thanks and regards, Rakesh Malviya

luheng commented 6 years ago

I used a Titan X GPU. For the propid model it's about an hour (or less?). For conll05_model it's about a week, but it gets pretty good results after about 24h. Compiling the 8-layer model for the first time (if you use the FAST_RUN option) takes about 8 hours due to the variational dropout layer.