Closed qhenkart closed 11 months ago
In order to allow multiple Encodings to be initialized concurrently, I did not lock `getEncoding()` but only used `sync.Once` to ensure the same instance is returned for the same Encoding name.
Now it seems that `getEncoding()` is a very expensive operation and perhaps should not be allowed to execute concurrently, as it will allocate too much memory during execution.
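As a rough illustration of the idea (not the library's actual code), the sketch below keeps one `sync.Once` per Encoding name and additionally serializes the expensive load behind a mutex, so only one load allocates memory at a time. The `Encoding` struct, `loadEncoding`, and the package-level variables are hypothetical stand-ins; only `sync.Once` and `sync.Mutex` are real standard-library types.

```go
package main

import (
	"fmt"
	"sync"
)

// Encoding is a hypothetical stand-in for the library's encoding type.
type Encoding struct {
	Name string
}

// entry pairs a sync.Once with the instance it guards.
type entry struct {
	once sync.Once
	enc  *Encoding
}

var (
	mu      sync.Mutex // guards the entries map
	entries = map[string]*entry{}
	loadMu  sync.Mutex // serializes the expensive load across all names
	loads   int        // number of loads performed; guarded by loadMu
)

// loadEncoding is a hypothetical placeholder for the expensive step
// (in the real library this is where the large maps would be built).
func loadEncoding(name string) *Encoding {
	loads++
	return &Encoding{Name: name}
}

// getEncoding returns the same instance for the same name and never
// runs two loads at the same time.
func getEncoding(name string) *Encoding {
	mu.Lock()
	e, ok := entries[name]
	if !ok {
		e = &entry{}
		entries[name] = e
	}
	mu.Unlock()

	e.once.Do(func() {
		loadMu.Lock()
		defer loadMu.Unlock()
		e.enc = loadEncoding(name)
	})
	return e.enc
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			getEncoding("cl100k_base")
		}()
	}
	wg.Wait()
	fmt.Println("loads:", loads)
}
```

With this shape, ten goroutines asking for the same name trigger a single load, and goroutines asking for different names wait their turn instead of allocating simultaneously.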
@pkoukk thank you so much for the quick response and enhancement. Very cool.
I just want to clarify that with this update, I can generate a single encoding in the main.go file during server initialization and pass it to the handler so that requests and threads created within the request can reuse the same encoding maps concurrently?
Sure, of course you can. You can also initialize tiktoken in goroutines. Now 10 threads only need to allocate about 10 MB of space, which is 1/10 of the original.
I have a basic counter use case; however, because GetEncoding is so expensive, my server with 50 MB of memory gets an OOM error immediately with only about 7 goroutines calling it at the same time. It would be great if this were optimized, or if the tkm could be shared and reused.