Open michaelfeil opened 5 months ago
Currently allowing up to batch_size=64 as default. This can potentially lead to high memory usage, e.g. for jina-8k bert -> 64x8192. It would be better to adjust dynamically and set a token budget, e.g. 64*512=32768 per forward pass.
Currently allowing up to batch_size=64 as default. This can potentially lead to high memory usage, e.g. for jina-8k bert -> 64x8192. It would be better to adjust dynamically and set a token budget, e.g. 64*512=32768 per forward pass.