If you are referring to the iw_vae example (?), the code just runs floor(n_samples / batch_size) batches and averages the loss over those. So if n_samples is not divisible by batch_size, the last samples in the (shuffled) train set are just discarded.
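For concreteness, a small sketch of that floor-division behaviour (plain Python with illustrative numbers, not code from the example):

```python
n_samples, batch_size = 50000, 24
n_batches = n_samples // batch_size   # floor division: 2083 batches
n_used = n_batches * batch_size       # 49992 samples actually visited
n_dropped = n_samples - n_used        # 8 samples silently discarded each epoch
print(n_batches, n_used, n_dropped)
```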
Ah, I'm actually referring to the VIMCO example.
In my own code I do,
n_batches = (n_samples + batch_size - 1) // batch_size
and then e.g.
batch_slice = slice(sym_index * sym_batch_size, T.minimum((sym_index + 1) * sym_batch_size, x.shape[0]))
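A minimal self-contained sketch of that ceil-division plus clamped-slice pattern, with NumPy standing in for the Theano symbolic variables (the names here are illustrative, not taken from the example):

```python
import numpy as np

def iterate_minibatches(x, batch_size):
    """Yield minibatches, including a smaller final batch when
    batch_size does not evenly divide the number of samples."""
    n_samples = x.shape[0]
    n_batches = (n_samples + batch_size - 1) // batch_size  # ceil division
    for index in range(n_batches):
        start = index * batch_size
        stop = min((index + 1) * batch_size, n_samples)     # clamp like T.minimum
        yield x[start:stop]

x = np.random.rand(10000, 784)   # e.g. a 10000-sample MNIST split
sizes = [b.shape[0] for b in iterate_minibatches(x, 24)]
print(len(sizes), sizes[-1])     # 417 batches, final batch of size 16
```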
Actually, I see now that it is effectively doing a floor with the integer division, so it isn't a problem
See e.g.: https://github.com/casperkaae/parmesan/blob/master/examples/vimco.py#L318
I was just wondering whether there is a bug when the batch size doesn't evenly divide the sample set.
For instance, when the batch size is 24, the last minibatch of the training set on MNIST will be of size 16.
I think the code returns an array of size (24, eq_samples) for the bound on each minibatch of the epoch regardless, with the last one padded with zeros (I'm not sure how Theano handles indexing with a slice that runs past the end of the array), so the bound reported will be slightly smaller than it actually is, since it gets divided by 10008 * eq_samples.
Rather than keep building up an array of bound values, would it not be better to accumulate them in a scalar? See:
https://github.com/casperkaae/parmesan/blob/master/examples/vimco.py#L322
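Something along these lines would do it; a hedged sketch only, with a dummy eval_bound standing in for the compiled Theano test function in the example:

```python
import numpy as np

def eval_bound(batch):
    # placeholder for the compiled Theano function that returns
    # one bound value per sample in the batch
    return np.random.rand(batch.shape[0])

x = np.random.rand(10000, 784)
batch_size = 24

bound_sum, n_seen = 0.0, 0
for start in range(0, x.shape[0], batch_size):
    batch = x[start:start + batch_size]   # final batch may be smaller than 24
    bound_sum += eval_bound(batch).sum()  # accumulate in a scalar, no padding
    n_seen += batch.shape[0]

print(bound_sum / n_seen)  # mean over the samples actually evaluated
```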