ctn-archive / nengo_theano

ABANDONED; see https://github.com/nengo/nengo instead
MIT License

Cannot compute test value error when using GPU implementation of theano. #16

Closed. xchoo closed this issue 11 years ago.

xchoo commented 11 years ago

The following error is thrown when trying to use the GPU (CUDA) implementation of theano.

    Using gpu device 0: GeForce GTX 280
    starting simulation
    Traceback (most recent call last):
      File "c:\Program_Files\Python27\lib\runpy.py", line 162, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "c:\Program_Files\Python27\lib\runpy.py", line 72, in _run_code
        exec code in run_globals
      File "d:\fchoo\Documents\GitHub\nef-py\nef\test\test_array.py", line 43, in <module>
        net.run(timesteps*dt_step)
      File "nef\nef_theano\network.py", line 496, in run
        self.theano_tick = self.make_theano_tick()
      File "nef\nef_theano\network.py", line 482, in make_theano_tick
        return theano.function([], [], updates=updates)
      File "c:\Program_Files\Python27\lib\site-packages\theano\compile\function.py", line 221, in function
        profile=profile)
      File "c:\Program_Files\Python27\lib\site-packages\theano\compile\pfunc.py", line 484, in pfunc
        no_default_updates=no_default_updates)
      File "c:\Program_Files\Python27\lib\site-packages\theano\compile\pfunc.py", line 202, in rebuild_collect_shared
        update_val = store_into.type.filter_variable(update_val)
      File "c:\Program_Files\Python27\lib\site-packages\theano\sandbox\cuda\type.py", line 147, in filter_variable
        return theano.sandbox.cuda.basic_ops.GpuFromHost()(other)
      File "c:\Program_Files\Python27\lib\site-packages\theano\gof\op.py", line 401, in __call__
        raise ValueError('Cannot compute test value: input %i (%s) of Op %s missing default value' % (i, ins, node))
    ValueError: Cannot compute test value: input 0 (Elemwise{Cast{float32}}.0) of Op GpuFromHost(Elemwise{Cast{float32}}.0) missing default value

This error is only thrown when the GPU option is enabled (i.e. device = gpu in the .theanorc file).

The cause of this error is the "theano.config.compute_test_value = 'raise'" line in network.py. If this line is not needed, it should be removed.
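
For reference, a minimal sketch of the suggested change; only the Theano flag itself is shown, not the surrounding network.py code:

    import theano

    # Currently set in nef/nef_theano/network.py: this forces every symbolic
    # variable to carry a test value, and raises the error above when an
    # intermediate value (here, the input to GpuFromHost) has none.
    # theano.config.compute_test_value = 'raise'

    # Suggested: leave the flag at its default ('off'), or turn it on only
    # while debugging on the CPU backend.
    theano.config.compute_test_value = 'off'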

studywolf commented 11 years ago

Done and done. I think I had just thrown that in there earlier at some desperate point while trying to debug.

xchoo commented 11 years ago

Righto. I'm going to do some proper benchmarking between the CPU and GPU implementations. The preliminary tests don't look good, though: the GPU implementation of theano seems to run about half as fast as the CPU implementation. That said, bigger models or longer run times might still show a benefit for the GPU.
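
As a rough illustration of the kind of comparison meant here (this is not the actual nef-py test code; the sizes and update rule are made up), one can compile a Theano update function once and time repeated calls, running the same script with device = cpu and then device = gpu:

    import time
    import numpy as np
    import theano
    import theano.tensor as T

    n = 1000
    weights = theano.shared(np.random.randn(n, n).astype('float32'))
    state = theano.shared(np.random.randn(n).astype('float32'))

    # One simulator-style "tick": update the state in place.
    tick = theano.function([], [], updates=[(state, T.tanh(T.dot(weights, state)))])

    tick()  # warm-up call (excludes compilation and first-transfer overhead)
    start = time.time()
    for _ in range(1000):
        tick()
    print("1000 ticks: %.3f s" % (time.time() - start))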

jaberg commented 11 years ago

I don't know exactly what you're running as a benchmark, but I would say, probably don't worry too much if the initial results are disappointing. There are a lot of things that affect speed, and there may well be a few not-too-difficult but important changes to make in Theano to get good performance for this type of computation.

Part of the story is the NEF itself - the dot products that produce the semantic vectors are not necessarily a good use of a GPU, especially the way Theano currently represents them. It might be better to merge encoders and decoders into full weight matrices, unless we can organize the computation so that a few thousand semantic vector components are computed in parallel, in which case the NEF becomes a big win again. We can talk about these things in the context of a particular model to get a sense of what the bottlenecks are and how to get past them.
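
To make the factored-versus-merged distinction concrete, here is an illustrative NumPy sketch (the sizes are arbitrary and the names are not from nef-py): the factored NEF connection applies a decoder and then an encoder as two skinny matrix products, while the merged form precomputes the full neuron-to-neuron weight matrix and does one large product per time step, which maps more naturally onto a single big GEMM on the GPU.

    import numpy as np

    n_pre, n_post, d = 1000, 1000, 16  # neurons per population, represented dimensions
    decoders = np.random.randn(d, n_pre).astype('float32')   # activities -> decoded vector
    encoders = np.random.randn(n_post, d).astype('float32')  # vector -> input currents
    activities = np.random.randn(n_pre).astype('float32')

    # Factored: two small matrix-vector products per connection.
    current_factored = encoders.dot(decoders.dot(activities))

    # Merged: precompute the full (n_post x n_pre) weight matrix once,
    # then do one large matrix-vector product per time step.
    weights = encoders.dot(decoders)
    current_merged = weights.dot(activities)

    assert np.allclose(current_factored, current_merged, rtol=1e-3)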

xchoo commented 11 years ago

Yup, sounds like a plan. I'm currently just running the test code in the nef-py directory, but those simulations are only 1-2 seconds long. I figure that as the simulation time gets longer, the GPU might win over the CPU. And as you mentioned, tweaking the NEF implementation to better suit theano and the GPU should give us better performance.


hunse commented 11 years ago

I don't think it's going to be simulation time so much as simulation size. The GPU excels at doing tons of computations in parallel. So if we can have all our neurons do their updates each time step as part of one huge op, that's where the GPU will win.
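
A sketch of that idea, using simplified leaky-integrator dynamics rather than the actual nef-py neuron model: if every ensemble's state lives in one big shared array, each time step becomes a single fused elementwise op (one GPU kernel) instead of many small per-ensemble ops.

    import numpy as np
    import theano

    n_total = 50000  # all neurons across all ensembles, concatenated
    dt, tau = np.float32(0.001), np.float32(0.02)

    voltage = theano.shared(np.zeros(n_total, dtype='float32'))
    current = theano.shared(np.random.rand(n_total).astype('float32'))

    # One elementwise update over the whole concatenated state vector.
    new_voltage = voltage + (dt / tau) * (current - voltage)
    tick = theano.function([], [], updates=[(voltage, new_voltage)])

    for _ in range(1000):
        tick()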