Closed Melon-Xu closed 4 months ago
Hi, thank you for your interest in our project! You are right: the bottleneck is the maximum number of elements. Memory consumption grows quadratically with the number of elements.
There are several workarounds:
1. Reduce the maximum number of elements.
2. Reduce the batch size (note: you may need to re-tune hyperparameters such as the learning rate and the number of epochs).
3. Reduce the hidden dimension size in the Transformer blocks.
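As a rough illustration of why memory grows quadratically with the maximum number of elements, here is a back-of-envelope sketch. The batch size, head count, and the `attention_scores_bytes` helper are hypothetical, not the project's actual configuration; real usage also includes activations, gradients, and optimizer state.

```python
# Hypothetical estimate: the self-attention score matrix alone occupies
# batch * heads * N * N floats per layer, so doubling N (the maximum
# number of elements per layout) roughly quadruples this term.

def attention_scores_bytes(batch, heads, n_elements, bytes_per_float=4):
    """Size in bytes of one layer's attention score matrix."""
    return batch * heads * n_elements * n_elements * bytes_per_float

for n in (25, 50, 100):
    mib = attention_scores_bytes(batch=64, heads=8, n_elements=n) / 2**20
    print(f"N={n:3d}: {mib:8.2f} MiB per layer")
```

This is why workarounds (i) and (ii) above are the most direct levers: both shrink the `batch * N * N` factor.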
Thank you so much! I will try these workarounds.
Excellent work! I am wondering whether the memory usage is related to the maximum number of elements in the layout.
I applied the model to a custom dataset in which the number of elements in one layout is much larger than 25, and I run out of memory.
Do I need to reduce the number of elements? Thank you so much!