Hi,
I was trying to figure out the maximum value of the `subbatch_size` parameter I could use with my GPU, hoping that too large a `subbatch_size` would simply cause an OOM error. However, it seems that when `subbatch_size` exceeds the sequence length, some strange behaviors occur (e.g., the number of recycling iterations goes wild, the pTM score becomes NaN, and so on).
My question is: what's the proper way of finding the maximum `subbatch_size` for a given amount of VRAM?
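For context, the only approach I can think of is trial and error, something like a binary search over candidate values. Here's a minimal sketch; `fits` is a hypothetical probe that, in practice, would run one forward pass at that `subbatch_size` and catch an OOM error:

```python
# Sketch: binary search for the largest subbatch_size that passes a
# user-supplied "does this fit in VRAM?" probe. `fits` is a hypothetical
# callback; a real one would run a forward pass and catch an OOM exception.

def max_subbatch_size(fits, lo=1, hi=4096):
    """Return the largest v in [lo, hi] with fits(v) True, or None."""
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if fits(mid):          # this size fits; try larger
            best = mid
            lo = mid + 1
        else:                  # OOM (or other failure); try smaller
            hi = mid - 1
    return best

# Toy stand-in: pretend anything up to 384 fits in memory.
print(max_subbatch_size(lambda v: v <= 384))  # → 384
```

But this is slow (each probe is a full forward pass) and, given the weird behaviors above, the probe can "succeed" while producing garbage, so I'm hoping there's a more principled way.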
Thanks,
Leandro.