Use thread local state for high compressor, this can reduce one memory
allocation and free for each compress call to one memory allocation per
thread.
Size of state is roughly 256KB for each thread. Since number of threads normally should not be insanely large, using thread local seems to be acceptable for me. But if other people thinks it is too large, can make it configurable to choose between long term memory footprint and performance. Or extend the interface to either not use singleton instance and save state in compressor instance or allow user to pass in state managed by themselves when calling compress.
Use thread local state for high compressor, this can reduce one memory allocation and free for each compress call to one memory allocation per thread.
Size of state is roughly 256KB for each thread. Since number of threads normally should not be insanely large, using thread local seems to be acceptable for me. But if other people thinks it is too large, can make it configurable to choose between long term memory footprint and performance. Or extend the interface to either not use singleton instance and save state in compressor instance or allow user to pass in state managed by themselves when calling compress.