Closed: CatarinaSilva closed this issue 4 years ago
Hi, do you have some very long input lines there?
The maximum is 96 tokens (already split into final subwords), which doesn't strike me as a super long line.
With a mini-batch size of 80 sentences times 96 tokens I could imagine something like this happening, especially for a standard transformer, where the decoder history keeps growing as decoding proceeds. Maybe just reduce the batch size?
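For reference, a hypothetical invocation with a smaller mini-batch could look like the sketch below; the model and file names as well as the numeric values are placeholders, so check marian-decoder --help for the options available in your version.

```bash
# Hypothetical marian-decoder call; all paths and values are placeholders.
# --mini-batch: number of sentences translated at once
# --maxi-batch: number of mini-batches pre-loaded for length-based sorting
# -w/--workspace: GPU workspace (in MB) that Marian pre-allocates per device
marian-decoder \
    -m model.npz \
    -v vocab.src.yml vocab.trg.yml \
    -i test.bpe.src \
    --mini-batch 16 --maxi-batch 100 \
    -b 6 \
    -w 8000 \
    -d 0 \
    > test.out
```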
Yes, reducing the batch size was my go-to quick fix. I was just wondering whether there might be something else I am missing, or some bug in this particular version, especially since training ran fine with the same batch size; then again, the training data may not have had such a long sentence (I will check that as well; a quick way to do so is sketched below).
Thanks anyways :)
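In case it helps, a quick way to check for unusually long lines (counting whitespace-separated tokens, assuming one sentence per line in an already-tokenized file; the file name is just an example) is something like:

```bash
# Print the maximum number of whitespace-separated tokens found on any line.
awk '{ if (NF > max) max = NF } END { print max }' test.bpe.src
```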
After training a model, running marian-decoder crashes after a few log messages reporting large memory allocations:
Is this intended, or is it a bug? Why does decoding need so much memory?
GPU memory is 16 GB, and so is RAM.