Closed: SeekPoint closed this issue 8 years ago.
I cannot look into the code right now, but with `warn_float64=ignore` it should not raise this exception.
```
rzai@rzai00:~/prj/folk-rnn$ cat ~/.theanorc
[global]
floatX = float32
device = gpu
warn_float64 = ignore

[nvcc]
fastmath = True
```
I already set `warn_float64=ignore`.
It's set in the code (https://github.com/IraKorshunova/folk-rnn/blob/master/train_rnn.py#L17). I ran train_rnn.py and didn't get this error. My versions: Theano==0.7.0, Lasagne==0.2.dev1.
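For reference, a flag set in code overrides `~/.theanorc` for that process, which is why the file alone may not be enough. A minimal sketch, assuming the setting runs before any Theano graph is built (the exact line in train_rnn.py may differ):

```python
import theano

# Must run before building any graph; overrides the ~/.theanorc values
# for this process only.
theano.config.warn_float64 = 'ignore'
theano.config.floatX = 'float32'
```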
Thanks, it works now.
I got an out-of-memory error on the GPU.
I am using a GTX 1080 with 8 GB.
What GPU did you use?
I had 12 GB. Try the Theano flag `allow_gc=True`.
Setting `allow_gc=False` doesn't work:
```
rzai@rzai00:~/prj/folk-rnn$ CUDA_VISIBLE_DEVICES=1 THEANO_FLAGS='allow_gc=False' python train_rnn.py config5 data/allabcworepeats_parsed
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled)
float32
config5-allabcworepeats_parsed-20161124-195010
vocabulary size: 12535
train_rnn.py:68: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  valid_idxs = rng.choice(np.arange(ntunes), nvalid_tunes, replace=False)
n tunes: 23636
n train tunes: 22484.0
n validation tunes: 1152.0
min, max length 54 2958
Building the model
/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan.py:1019: Warning: In the strict mode, all neccessary shared variables must be passed as a part of non_sequences
  'must be passed as a part of non_sequences', Warning)
number of parameters: 194480456
layer output shapes:
                 #params:   output shape:
InputLayer              0   (64, None)
EmbeddingLayer  157126225   (64, None, 12535)
InputLayer              0   (64, None)
LSTMLayer        26723328   (64, None, 512)
DropoutLayer            0   (64, None, 512)
LSTMLayer         2100224   (64, None, 512)
DropoutLayer            0   (64, None, 512)
LSTMLayer         2100224   (64, None, 512)
DropoutLayer            0   (64, None, 512)
ReshapeLayer            0   (None, 512)
DenseLayer        6430455   (None, 12535)
Train model
Error allocating 71696384 bytes of device memory (out of memory). Driver report 22216704 bytes free and 8507162624 bytes total
Traceback (most recent call last):
  File "train_rnn.py", line 202, in <module>
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
rzai@rzai00:~/prj/folk-rnn$
```
And when `allow_gc=True`?
`allow_gc=True` is the default; I tried that too, and it doesn't work.
Maybe it's because you're creating float64 values. For me, training takes about 1 GB of GPU memory. If you don't find a solution, you can make the model or the batch size smaller.
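For scale, the parameter counts printed in the log can be reproduced with a quick back-of-the-envelope check in plain Python. The layer sizes come from the log above; the `+ 2 * n_hid` term for learnable initial cell/hidden states is an assumption about the Lasagne LSTM configuration:

```python
# Back-of-the-envelope parameter count for the model in the log.
# Sizes from the printed layer table: vocabulary 12535, hidden size 512.
vocab, hidden = 12535, 512

embedding = vocab * vocab                       # one-hot-sized embedding

def lstm_params(n_in, n_hid):
    # 4 gates, each with input weights, recurrent weights and a bias;
    # the extra 2 * n_hid assumes learnable initial cell/hidden states.
    return 4 * (n_in * n_hid + n_hid * n_hid + n_hid) + 2 * n_hid

lstm1 = lstm_params(vocab, hidden)
lstm23 = lstm_params(hidden, hidden)            # layers 2 and 3 are identical
dense = hidden * vocab + vocab                  # output weights + biases

total = embedding + lstm1 + 2 * lstm23 + dense
print(total)                                    # → 194480456, matching the log
# Parameters alone take about 0.78 GB in float32, and double that in
# float64, before activations, gradients and optimizer state are counted.
print(total * 4 / 1e9)
```

The totals match the log exactly, which is why a float64 copy of the model (or a large batch of long sequences) can tip an 8 GB card over the edge.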
OK, I will try.
```
rzai@rzai00:~/prj/folk-rnn$ cat ~/.theanorc
[global]
floatX = float32
device = gpu
warn_float64 = ignore

[nvcc]
fastmath = True

rzai@rzai00:~/prj/folk-rnn$ CUDA_VISIBLE_DEVICES=1 python train_rnn.py config5 data/allabcworepeats_parsed
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled)
float32
config5-allabcworepeats_parsed-20161123-204529
vocabulary size: 12535
train_rnn.py:68: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  valid_idxs = rng.choice(np.arange(ntunes), nvalid_tunes, replace=False)
n tunes: 23636
n train tunes: 22484.0
n validation tunes: 1152.0
min, max length 54 2958
Building the model
Traceback (most recent call last):
  File "train_rnn.py", line 127, in <module>
    predictions = nn.layers.get_output(l_out)
  File "/usr/local/lib/python2.7/dist-packages/lasagne/layers/helper.py", line 185, in get_output
    all_outputs[layer] = layer.get_output_for(layer_inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/lasagne/layers/recurrent.py", line 1050, in get_output_for
    strict=True)[0]
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan.py", line 792, in scan
    fake_nonseqs = [x.type() for x in non_seqs]
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/type.py", line 323, in __call__
    return utils.add_tag_trace(self.make_variable(name))
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/type.py", line 401, in make_variable
    return self.Variable(self, name=name)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/var.py", line 716, in __init__
    raise Exception(msg)
Exception: You are creating a TensorVariable with float64 dtype. You requested an action via the Theano flag warn_float64={ignore,warn,raise,pdb}.
rzai@rzai00:~/prj/folk-rnn$
```
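For anyone hitting the same warning: the `22484.0` / `1152.0` counts in the log show the tune split is computed as a Python float, which is what triggers the `VisibleDeprecationWarning` from `rng.choice` and is one plausible way float64 values creep in. A minimal sketch of the cast fix; the `0.05` split fraction is a made-up placeholder, not the value used in train_rnn.py:

```python
import numpy as np

ntunes = 23636                         # from the log
nvalid_tunes = 0.05 * ntunes           # placeholder fraction; result is a float
# Cast explicitly so downstream sizes stay integral and the
# VisibleDeprecationWarning from rng.choice goes away:
nvalid_tunes = int(nvalid_tunes)

rng = np.random.RandomState(42)
valid_idxs = rng.choice(np.arange(ntunes), nvalid_tunes, replace=False)
print(valid_idxs.shape)                # → (1181,)
```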