pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

Automatic batching? #830

Closed · ZeweiChu closed this 1 year ago

ZeweiChu commented 7 years ago

Can we add a feature to do automatic batching? We would write code for a single instance and PyTorch would automatically batch it for us.

cc @zou3519

soumith commented 7 years ago

PyTorch only supports mini-batches, so rather than writing code for a single instance (which we don't support), all of our users are encouraged to write code for mini-batches.
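
(For reference, a minimal sketch of the mini-batch convention, shown with today's tensor API rather than the Variable API of the era: modules take input whose first dimension is the batch, so a "single instance" is just a batch of size 1.)

import torch
import torch.nn as nn

layer = nn.Linear(10, 5)
batch = torch.randn(32, 10)         # mini-batch of 32 instances, 10 features each
out = layer(batch)                  # (32, 5)
single = layer(torch.randn(1, 10))  # a single instance as a batch of 1 -> (1, 5)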

jekbradbury commented 7 years ago

In the long term, PyTorch will have the components needed to build something like this (in particular, lazy execution and just-in-time graph compilation/optimization). When those prerequisites are ready, it would be great to have a TF Fold-like automatic batching capability.

apaszke commented 7 years ago

Yes, we'll definitely be exploring ways of batching and rewriting the graph in the future. TF Fold-like functionality will be easy to add, and I think it's only going to be a start.

jekbradbury commented 6 years ago

I'm working on this 🙂

maor121 commented 6 years ago

Hi, I thought I would contribute an idea taken from another framework: DyNet.

Currently, implementing batching for sequences in PyTorch is a significant pain:

1) PackedSequence alters the input order; exactly how is undocumented, and unless you want to constantly unpack and repack (inefficient), you will have a hard time running operations on it.
2) Masking is slow.
3) Batching sequences of the same length together is possible and can be done manually, but it damages shuffle randomization, which hurts accuracy in the end unless you use very small batch sizes.

I spent a LOT of time on this; it is a significant flaw...
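
For readers unfamiliar with the pain point, here is a minimal sketch of the pack/pad round trip being described (the shapes and the toy LSTM are made up for illustration):

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# A toy batch of 3 sequences with lengths 5, 3, 2, zero-padded to length 5.
lengths = torch.tensor([5, 3, 2])  # must be sorted by decreasing length
padded = torch.randn(3, 5, 8)      # (batch, max_len, features)

rnn = nn.LSTM(8, 16, batch_first=True)

# pack_padded_sequence reorders timesteps internally (the opaque ordering
# the comment complains about) so the RNN can skip the padding...
packed = pack_padded_sequence(padded, lengths, batch_first=True)
out_packed, _ = rnn(packed)

# ...and any per-timestep operation on the output means unpacking again.
out, out_lens = pad_packed_sequence(out_packed, batch_first=True)  # (3, 5, 16)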

Also, if you want to use a custom LSTM in PyTorch together with batching, that is hard too... You have to start diving into the source code.

For those not familiar, DyNet implements lazy evaluation: you iterate over the data in a loop and process it as you normally would, and later, on the call to forward, the data is batched together automatically, as described in https://arxiv.org/abs/1705.07860

This saves the hassle of organizing the data into batches manually, which is a pain. My suggestion for implementing this in PyTorch is as follows:

1) Create classes LazyModule, LazyTensor, and LazyVariable.
2) LazyModule's forward function will receive a LazyVariable, which would have the same interface as a Variable. The dimensions of the input will NOT include a batch dimension; it would be a single sequence. Operations on the lazy object will be saved in a stack, to be performed later on the entire batch.
3) On the final call to forward (evaluate/backward...), LazyModule will have all the stored forward calls and will be able to batch the input and run the network on the data.

This would save the hassle of using PackedSequence, masking, or manually batching sequences of the same length. The idea is taken from DyNet and is not my own; a toy sketch of the record-and-replay mechanism follows.
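
As a toy illustration of that record-and-replay idea (evaluate and the op-recording methods here are made up for illustration, not a real PyTorch or DyNet API; unlike real autobatching, this sketch assumes every instance records the same sequence of ops):

import torch

class LazyTensor:
    # Toy record-and-replay tensor: ops on a single, unbatched example
    # are recorded and only executed once a whole batch is collected.
    def __init__(self, data):
        self.data = data   # one unbatched example
        self.ops = []      # deferred operations

    def matmul(self, weight):
        self.ops.append(lambda t: t @ weight)
        return self

    def relu(self):
        self.ops.append(torch.relu)
        return self

def evaluate(lazy_tensors):
    # Stack the raw inputs into one batch, then replay the recorded ops
    # once on the whole batch instead of once per example.
    batch = torch.stack([lt.data for lt in lazy_tensors])
    for op in lazy_tensors[0].ops:
        batch = op(batch)
    return batch

w = torch.randn(8, 4)
examples = [LazyTensor(torch.randn(8)).matmul(w).relu() for _ in range(32)]
out = evaluate(examples)   # (32, 4): one batched matmul instead of 32

Real autobatching, as in the DyNet paper linked above, instead analyzes the recorded graph each iteration so that instances with different structure (e.g. different sequence lengths) can still share batched ops.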

What do you think? Is this possible in Pytorch?

jekbradbury commented 6 years ago

There was some discussion about that possibility when DyNet autobatch was released. What I'm working on now aims to solve problems 1–3 without the significant limitations and performance drawbacks of DyNet's approach (e.g., in DyNet the batching process is a graph optimization pass that runs every iteration and often takes longer than executing the batched network itself).

rohun-tripathi commented 6 years ago

@jekbradbury Is there an update on autobatching?

jekbradbury commented 6 years ago

Yes, hoping to open-source it soon. Something like this works, allowing x to be either a Variable with batch size 1 or a Batch object where dimension 1 is variable-length; we statically analyze the loops (because different examples need to run through them different numbers of times) and have method overloads for everything else:

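# Note: the @batch decorator and the F namespace below come from the
# author's then-unreleased autobatching library (later open-sourced as
# salesforce/matchbox, per the closing comment), not from core PyTorch.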
class BiRNN(nn.Module):
   def __init__(self, size):
      super().__init__()
      self.fwd = nn.RNNCell(size, size)
      self.bwd = nn.RNNCell(size, size)

   @batch
   def forward(self, x):
      h0 = x.new(1, x.size(-1)).zero_()
      h, fwd = h0, []
      for xt in x.unbind(1):
         h = self.fwd(xt, h)
         fwd.append(h)
      fwd = F.stack(fwd, 1)
      h, bwd = h0, []
      for xt in reversed(x.unbind(1)):
         h = self.bwd(xt, h)
         bwd.append(h)
      bwd = F.stack(reversed(bwd), 1)
      return F.cat((fwd, bwd), 2)

soumith commented 6 years ago

closed via https://github.com/salesforce/matchbox

somnathrakshit commented 4 years ago

salesforce/matchbox does not work with PyTorch 1.5+. Is there an updated way of doing this?

zou3519 commented 4 years ago

Reopening this issue due to @somnathrakshit's comment