lmjohns3 / theanets

Neural network toolkit for Python
http://theanets.rtfd.org
MIT License

ValueError: Shape mismatch when creating batches from 3d array on axis=1 #52

Closed: nkundiushuti closed this issue 9 years ago

nkundiushuti commented 9 years ago

I have a recurrent regression architecture in a toy example with layers [8, 10, 24]. I am creating a dataset from two numpy arrays with dimensions [40, 64, 8] and [40, 64, 24], with batch_size taking different values, but let's say batch_size=8, and axis=1 (the batches are split along the sequences axis). Training the network gives the error below, which does not happen if the split is done along the time axis (axis=0):

```
File "/home/neuralnets/theanets/main.py", line 246, in train
  for _ in self.itertrain(*args, **kwargs):
File "/home/neuralnets/theanets/main.py", line 315, in itertrain
  for i, costs in enumerate(opt.train(*sets)):
File "/home/neuralnets/theanets/trainer.py", line 162, in train
  if not self.evaluate(iteration, valid_set):
File "/home/marius/neuralnets/theanets/trainer.py", line 116, in evaluate
  np.mean([self.f_eval(*x) for x in valid_set], axis=0)))
File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 588, in __call__
  self.fn.thunks[self.fn.position_of_error])
File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 579, in __call__
  outputs = self.fn()
File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 656, in rval
  r = p(n, [x[0] for x in i], o)
File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 650, in
  self, node)
File "scan_perform.pyx", line 341, in theano.scan_module.scan_perform.perform (/home/marius/.theano/compiledir_Linux-3.13.0-40-generic-i686-with-Ubuntu-14.04-trusty-i686-2.7.6-32/scan_perform/mod.cpp:3573)
File "scan_perform.pyx", line 335, in theano.scan_module.scan_perform.perform (/home/marius/.theano/compiledir_Linux-3.13.0-40-generic-i686-with-Ubuntu-14.04-trusty-i686-2.7.6-32/scan_perform/mod.cpp:3505)
ValueError: Shape mismatch: x has 64 rows but z has 8 rows
Apply node that caused the error: Gemm{inplace}(Dot22.0, TensorConstant{1.0}, <TensorType(float64, matrix)>, W_pool_copy, TensorConstant{1.0})
Use another linker then the c linker to have the inputs shapes and strides printed.
Use the Theano flag 'exception_verbosity=high' for a debugprint of this apply node.

Apply node that caused the error: forall_inplace,cpu,scan_fn}(Shape_i{0}.0, Subtensor{int64:int64:int8}.0, Alloc.0, W_0, W_pool, InplaceDimShuffle{x,0}.0)
Inputs shapes: [(), (40, 8, 8), (40, 64, 10), (8, 10), (10, 10), (1, 10)]
Inputs strides: [(), (4096, 64, 8), (5120, 80, 8), (80, 8), (80, 8), (80, 8)]
Inputs types: [TensorType(int64, scalar), TensorType(float64, 3D), TensorType(float64, 3D), TensorType(float64, matrix), TensorType(float64, matrix), TensorType(float64, row)]
Use the Theano flag 'exception_verbosity=high' for a debugprint of this apply node.
```

I am working with audio, so at each time frame I have a 1-D array giving the spectrum, across many audio sequences. I realize that the library is under intense development right now, but any help on this issue would be appreciated.
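For reference, the shapes involved can be reproduced with plain numpy, independently of theanets (a sketch; the array layout here is (time, sequences, features) as described above). Slicing along axis=1 with a batch size of 8 yields batches of shape (40, 8, 8), which matches the (40, 8, 8) input shape reported in the traceback, while slicing along axis=0 yields (8, 64, 8):

```python
import numpy as np

# (time, sequences, features): 40 time steps, 64 audio sequences,
# 8 spectrum bins per frame.
input_data = np.random.rand(40, 64, 8)
batch_size = 8

# Splitting along axis=1 (the sequences axis) keeps all 40 time steps
# and selects 8 sequences per batch.
batches_axis1 = [input_data[:, i:i + batch_size, :]
                 for i in range(0, input_data.shape[1], batch_size)]

# Splitting along axis=0 (the time axis) keeps all 64 sequences and
# selects 8 time steps per batch.
batches_axis0 = [input_data[i:i + batch_size, :, :]
                 for i in range(0, input_data.shape[0], batch_size)]

print(batches_axis1[0].shape)  # (40, 8, 8)
print(batches_axis0[0].shape)  # (8, 64, 8)
```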

nkundiushuti commented 9 years ago

Here is some code to reproduce the problem (imports added for completeness):

```python
import numpy as np
import theanets

input_data = np.random.rand(40, 64, 8)
output_data = np.random.rand(40, 64, 24)

train = theanets.Dataset(
    samples=input_data,
    labels=output_data,
    batch_size=8,
    axis=1,
)

e = theanets.Experiment(
    theanets.recurrent.Regressor,
    layers=(8, 10, 24),
)
e.train(train, patience=500)

nkundiushuti commented 9 years ago

I found the problem: batch_size must also be specified when creating the experiment, or else it will be initialized with a default value that does not match the dataset's.
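That explanation matches the traceback: the compiled scan step multiplies a batch that actually contains 8 sequences into a buffer allocated for the experiment's default batch size. A minimal numpy illustration of the failing Gemm step (the shapes are taken from the "Inputs shapes" line above; the variable names are hypothetical):

```python
import numpy as np

# Input-to-hidden weights for the 8 -> 10 projection.
W = np.random.rand(8, 10)

# One time step of a dataset batch built with batch_size=8:
# 8 sequences x 8 spectrum features.
batch_step = np.random.rand(8, 8)

# Hidden-state buffer sized for a default batch of 64 sequences.
state = np.zeros((64, 10))

h = batch_step.dot(W)  # shape (8, 10)
try:
    state += h         # 64 rows vs 8 rows: cannot broadcast
    ok = True
except ValueError as exc:
    ok = False
    print('Shape mismatch:', exc)
```

Per the comment above, the fix is to pass the same batch_size=8 when constructing the Experiment, so the compiled graph and the dataset agree on the number of sequences per batch.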

lmjohns3 commented 9 years ago

Yes, I was just writing that. :)

nkundiushuti commented 9 years ago

Anyway, thanks! You are doing an amazing job here. I will test theanets in the coming weeks. Should I keep updating from the main branch, or do you recommend using the release version?

lmjohns3 commented 9 years ago

Thanks! More testing would be great. There's also a mailing list if you'd like to subscribe -- see the README.

The main branch is probably better to use for the time being, because I haven't been good about cutting releases often enough. It might break occasionally, however.