Closed TomReidNZ closed 5 years ago
Hello,
I think this could be caused by Theano's configuration. Judging by the error you are getting, the floatX parameter in Theano's config is probably set to float64 when it should be float32.
To check Theano's config, you can run:
import theano
print(theano.config)
To override the config for a single run, pass it through the THEANO_FLAGS environment variable:
THEANO_FLAGS='floatX=float32' python training.py
More information here: http://deeplearning.net/software/theano/library/config.html
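If it is more convenient to set the flag from inside Python than on the command line, the same environment variable can be set before Theano is imported. The following is only a minimal sketch: merge_theano_flags is a hypothetical helper (it is not part of Theano or of this repo), and it assumes THEANO_FLAGS holds comma-separated key=value pairs and is read when theano is first imported.

```python
import os


def merge_theano_flags(existing, **overrides):
    """Return a THEANO_FLAGS string with the given settings added or replaced.

    Hypothetical helper for illustration; not part of Theano.
    """
    flags = {}
    for item in existing.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty fragments such as a trailing comma
        key, _, value = item.partition("=")
        flags[key.strip()] = value.strip()
    flags.update(overrides)  # overrides win over whatever was already set
    # Sort keys so the result is deterministic on every Python version.
    return ",".join("{}={}".format(k, v) for k, v in sorted(flags.items()))


# This has to run before the first `import theano` in the process.
os.environ["THEANO_FLAGS"] = merge_theano_flags(
    os.environ.get("THEANO_FLAGS", ""), floatX="float32"
)
```

After this, importing theano and printing theano.config.floatX should show float32. If it still shows float64, the value is probably being set somewhere else, for example in a .theanorc file.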
If the problem persists, could you send more detailed information about your Theano config and the sentences you are passing to the net?
Regards
Hello, Did my answer solve your issue? Thanks!
Can we close this issue? @TomReidNZ did you solve the problem?
Now I get this error
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "training.py", line 6, in <module>
EncTrainer.train()
File "/home/zarmada/pix2story/Lab/source/training/train_encoder.py", line 40, in train
trainer(self.text, self.training_options)
File "/home/zarmada/pix2story/Lab/source/skipthoughts_vectors/training/train.py", line 150, in trainer
cost = f_grad_shared(x, x_mask, y, y_mask, z, z_mask)
File "/home/zarmada/anaconda3/envs/storytelling/lib/python3.5/site-packages/theano/compile/function_module.py", line 903, in __call__
self.fn() if output_subset is None else\
File "/home/zarmada/anaconda3/envs/storytelling/lib/python3.5/site-packages/theano/gof/vm.py", line 305, in __call__
link.raise_with_op(node, thunk)
File "/home/zarmada/anaconda3/envs/storytelling/lib/python3.5/site-packages/theano/gof/link.py", line 325, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File "/home/zarmada/anaconda3/envs/storytelling/lib/python3.5/site-packages/six.py", line 692, in reraise
raise value.with_traceback(tb)
File "/home/zarmada/anaconda3/envs/storytelling/lib/python3.5/site-packages/theano/gof/vm.py", line 301, in __call__
thunk()
File "/home/zarmada/anaconda3/envs/storytelling/lib/python3.5/site-packages/theano/gof/op.py", line 892, in rval
r = p(n, [x[0] for x in i], o)
File "/home/zarmada/anaconda3/envs/storytelling/lib/python3.5/site-packages/theano/tensor/elemwise.py", line 790, in perform
variables = ufunc(*ufunc_args, **ufunc_kwargs)
File "/home/zarmada/anaconda3/envs/storytelling/lib/python3.5/site-packages/theano/scalar/basic.py", line 4023, in impl
output_storage = [[None] for i in xrange(self.nout)]
SystemError: <class 'range'> returned a result with an error set
Apply node that caused the error: Elemwise{Composite{Switch(i0, ((i1 * i2) / i3), i2)}}[(0, 2)](InplaceDimShuffle{x,x}.0, TensorConstant{(1, 1) of 5.0}, Elemwise{Add}[(0, 1)].0, InplaceDimShuffle{x,x}.0)
Toposort index: 741
Inputs types: [TensorType(bool, (True, True)), TensorType(float32, (True, True)), TensorType(float32, matrix), TensorType(float32, (True, True))]
Inputs shapes: [(1, 1), (1, 1), (4800, 20000), (1, 1)]
Inputs strides: [(1, 1), (4, 4), (80000, 4), (4, 4)]
Inputs values: [array([[ True]], dtype=bool), array([[ 5.]], dtype=float32), 'not shown', array([[ 44.09825897]], dtype=float32)]
Outputs clients: [['output']]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Hello @TomReidNZ ,
Could you post the whole output you get when you execute training.py, along with some samples from the list of sentences you are passing to the net?
Regards
Hi @ericmcmc, this is the output of the error. We created this text model using these books for the test.
Hi @TomReidNZ and @gsegares,
Judging by the output, this could be caused by the version of Theano you are using. Could you try updating Theano to version 1.0.3?
Please let me know whether you can train the models using the original code with Theano 1.0.3.
Regards
Hi @ericmcmc, we switched the NC6 VM we were using in Azure to a deep learning template with all the GPU packages included, and we are not getting that error anymore. We also changed the conda file to use a specific version of Theano. We are having issues with some missing components in the repo (like paths['v_expansion'] = '../models/GoogleNews-vectors-negative300.bin'), but that's a different problem. Your suggestion regarding the Theano flag to fix the data type mismatch worked. I think we can close this issue. Thanks.
On a new clone of your repo, I can't get the model to train. There's a type mismatch in the updates when building the optimizer.
Running on conda on macOS (using CPU). I didn't modify any files, just added the .txt file. I tried updating n_words and changing various things in the config file, but no luck.
Any help would be much appreciated. Thanks, Tom
Error message:
Building optimizers...
Traceback (most recent call last):
File "/anaconda3/envs/storytelling/lib/python3.5/site-packages/theano/compile/pfunc.py", line 193, in rebuild_collect_shared
allow_convert=False)
File "/anaconda3/envs/storytelling/lib/python3.5/site-packages/theano/tensor/type.py", line 234, in filter_variable
self=self))
TypeError: Cannot convert Type TensorType(float64, matrix) (of Variable Elemwise{add,no_inplace}.0) into Type TensorType(float32, matrix). You can try to manually convert Elemwise{add,no_inplace}.0 into a TensorType(float32, matrix).
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "training.py", line 6, in <module>
EncTrainer.train()
File "/Users/tom/Documents/development/ailab/Pix2Story/source/training/train_encoder.py", line 40, in train
trainer(self.text, self.training_options)
File "/Users/tom/Documents/development/ailab/Pix2Story/source/skipthoughts_vectors/training/train.py", line 128, in trainer
f_grad_shared, f_update = eval(optimizer)(lr, tparams, grads, inps, cost)
File "/Users/tom/Documents/development/ailab/Pix2Story/source/skipthoughts_vectors/encdec_functs/optim.py", line 40, in adam
f_update = theano.function([lr], [], updates=updates, on_unused_input='ignore', profile=False)
File "/anaconda3/envs/storytelling/lib/python3.5/site-packages/theano/compile/function.py", line 317, in function
output_keys=output_keys)
File "/anaconda3/envs/storytelling/lib/python3.5/site-packages/theano/compile/pfunc.py", line 449, in pfunc
no_default_updates=no_default_updates)
File "/anaconda3/envs/storytelling/lib/python3.5/site-packages/theano/compile/pfunc.py", line 208, in rebuild_collect_shared
raise TypeError(err_msg, err_sug)
TypeError: ('An update must have the same type as the original shared variable (shared_var=<TensorType(float32, matrix)>, shared_var.type=TensorType(float32, matrix), update_val=Elemwise{add,no_inplace}.0, update_val.type=TensorType(float64, matrix)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')