Really good question; I'll make sure to update the readme to include a description of all flags. The flag minibatch_size is used to split the larger batch_size into smaller chunks that fit in memory. For example, say you want to hit a batch size of 1,024, but that won't fit on your GPU. You can split it into smaller batches that contain only 16 images each with the flag --minibatch_size 16.
This may be renamed to micro batching in future versions for the sake of correctness.
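For reference, a minimal sketch of the idea (not the repo's exact code): the large batch is encoded in minibatch-sized chunks so the full batch_size never has to sit on the GPU at once, and the resulting embeddings are stitched back together for the contrastive loss. The function name and loop structure here are assumptions; only encode_image and the flag names come from this thread.

```python
import torch
import torch.nn.functional as F

def encode_in_minibatches(model, images, minibatch_size):
    """images: [batch_size, C, H, W] -> normalized embeddings [batch_size, D]."""
    chunks = []
    for j in range(0, images.shape[0], minibatch_size):
        mb = images[j:j + minibatch_size]                 # one micro-batch
        emb = F.normalize(model.encode_image(mb), dim=1)  # encode and L2-normalize
        chunks.append(emb)
    # full batch of embeddings, ready for the contrastive loss
    return torch.cat(chunks, dim=0)
```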
During testing I realized that there is faulty handling of the minibatch_size flag. I'm updating it to fix the default assignment!
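A hedged sketch of what such a default-assignment fix could look like (args is the parsed argument namespace; the exact names are assumptions, not necessarily the repo's code). Leaving minibatch_size at 0 would otherwise produce zero-length tensor slices when chunking the batch.

```python
# Fall back to a single chunk (the full batch) when the flag is unset or 0.
if args.minibatch_size is None or args.minibatch_size <= 0:
    args.minibatch_size = args.batch_size
```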
Hello, this conversation is insightful. Thanks.
Why do we need this distinction between minibatch_size and batch_size? If the batch_size does not fit in the GPU to do a training step, why don't we simply make the batch_size smaller and set minibatch_size == batch_size?
Thank you for your CLIP training code! That's great!
Training with your new commit 8d454de, I get the following error:

RuntimeError: The expanded size of the tensor (0) must match the existing size (8) at non-singleton dimension 0. Target sizes: [0, 1024]. Tensor sizes: [8, 1024]

images_tmp[self.global_rank][j*self.minibatch_size:(j+1)*self.minibatch_size] = F.normalize(self.model.encode_image(mb), dim=1)

Here minibatch_size = 0. Would you please explain the meaning of minibatch_size? How should minibatch_size be used?