majidaldo closed 9 years ago
Yes, this is definitely a problem! I will try adding an axis parameter to the Dataset constructor and then do something intelligent with three-dimensional vs. two-dimensional datasets.
I checked in commit b0b118d which should address this issue. There is a new "axis" keyword argument to the Dataset constructor that allows you to specify the axis of batch splitting. It defaults to 0 for 2D datasets and 1 for 3D datasets.
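For anyone reading along, here is a rough sketch of the default-axis rule described above. It is not the code from the commit: `default_batch_axis` and `batch_slices` are hypothetical names made up for illustration, and it works only on shapes, so it runs without any array library installed.

```python
def default_batch_axis(ndim):
    """Pick the batch axis from the number of dimensions.

    Mirrors the rule stated in the comment above: axis 0 for 2D data,
    axis 1 for 3D data. (Hypothetical helper, not the library's code.)
    """
    if ndim == 2:
        return 0
    if ndim == 3:
        return 1
    raise ValueError("expected 2D or 3D data, got %d dimensions" % ndim)


def batch_slices(shape, batch_size, axis=None):
    """Return one index tuple per minibatch, splitting `shape` along `axis`.

    Each tuple can be used to index an array with the given shape,
    e.g. data[index] for a NumPy array.
    """
    if axis is None:
        axis = default_batch_axis(len(shape))
    indices = []
    for start in range(0, shape[axis], batch_size):
        stop = min(start + batch_size, shape[axis])
        index = [slice(None)] * len(shape)  # take everything on other axes
        index[axis] = slice(start, stop)    # take one batch on the batch axis
        indices.append(tuple(index))
    return indices


# 2D data, (num_examples, dimension): batches are cut along axis 0.
flat = batch_slices((100, 5), batch_size=32)

# 3D data, (sequence_length, batch_size, dimension): batches are cut along axis 1.
recurrent = batch_slices((20, 100, 5), batch_size=32)
```

Passing `axis` explicitly overrides the default, which is the same escape hatch the new keyword argument is described as giving.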
great! much simpler than the stuff in pylearn2 ;)
i've been out of this code for a while. this import no longer works:
from theanets.dataset import SequenceDataset as DS
has the need for the import been subsumed?
Yes, I don't think you need to import the dataset class at all.
Also, the Dataset class and all of the stochastic optimization routines have moved to https://github.com/lmjohns3/downhill
when SequenceDataset is initialized with an array, it is broken into minibatches along the first axis. however, when it's given a callable, the data the callable generates for an RNN is expected to have shape (sequence_length, batch_size, dimension), so the batch axis is the second one. this makes the two ways of initializing SequenceDataset inconsistent.
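To make the mismatch concrete, here is a small shape-only sketch. The helper `batch_shapes` and the sizes (20 time steps, 100 sequences, 5 features, batches of 32) are made up for illustration and are not taken from theanets.

```python
def batch_shapes(shape, batch_size, axis):
    """Shapes of the chunks you get by cutting `shape` into batches along `axis`."""
    shapes = []
    for start in range(0, shape[axis], batch_size):
        size = min(batch_size, shape[axis] - start)
        shapes.append(shape[:axis] + (size,) + shape[axis + 1:])
    return shapes


# Hypothetical RNN data laid out as (sequence_length, batch_size, dimension).
rnn_shape = (20, 100, 5)

# Splitting on the first axis cuts along time, so all 100 sequences
# land in every chunk instead of being divided into minibatches.
along_time = batch_shapes(rnn_shape, batch_size=32, axis=0)

# Splitting on the second axis gives minibatches of whole sequences,
# which is the layout expected from a callable.
along_batch = batch_shapes(rnn_shape, batch_size=32, axis=1)

print(along_time)
print(along_batch)
```

With these sizes the first split yields a single chunk still holding every sequence, while the second yields four minibatches that each keep the full 20 time steps.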