On our system we're seeing that any call to num_tokens_from_messages takes around 300-500ms consistently, regardless of the size of the message, which is crazily slow for what it's doing. A flamegraph showed the model being loaded on every call, so this change switches to the singleton model instances to see whether that improves performance.
I'm a bit sceptical of this, as the mutex locking might cause its own issues, but we'll see in our testing and maybe come up with a better solution going forwards...
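For reference, the shape of the change is roughly the following. This is a minimal sketch using std's `OnceLock`, not the actual tiktoken-rs types; `Model`, `load_model`, and `num_tokens` here are placeholders for the real encoder and its entry points:

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

// Hypothetical stand-in for the tokenizer model; in the real library the
// expensive part is parsing the BPE vocab and building its lookup maps.
struct Model {
    vocab: HashMap<String, u32>,
}

fn load_model() -> Model {
    let mut vocab = HashMap::new();
    vocab.insert("hello".to_string(), 0);
    vocab.insert("world".to_string(), 1);
    Model { vocab }
}

// Process-wide singleton: load_model() runs exactly once, on first use;
// every later call reuses the same instance behind the mutex instead of
// paying the load cost again.
fn model_singleton() -> &'static Mutex<Model> {
    static MODEL: OnceLock<Mutex<Model>> = OnceLock::new();
    MODEL.get_or_init(|| Mutex::new(load_model()))
}

// Toy token count: the lock is held only for the lookup, never the load.
fn num_tokens(text: &str) -> usize {
    let model = model_singleton().lock().unwrap();
    text.split_whitespace()
        .filter(|w| model.vocab.contains_key(*w))
        .count()
}
```

The mutex concern above is that every caller now serialises on that one lock; if encoding only needs shared access, an `RwLock` or handing each thread its own clone of the encoder would sidestep that.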
As an aside: given the vocab is known up front for a tokeniser, you should be able to just codegen a hashmap if you wanted, since hashmap inserts completely dominate num_tokens_from_messages.
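A rough sketch of that codegen idea, with a made-up two-entry vocab: emit the table as a `match` (or as a compile-time perfect hash, e.g. via the `phf` crate) so nothing is inserted into a map at runtime at all:

```rust
// Hypothetical generated code: because the vocab is fixed at build time,
// token lookup compiles down to a match instead of runtime HashMap inserts.
fn token_id(piece: &str) -> Option<u32> {
    match piece {
        "hello" => Some(0),
        "world" => Some(1),
        _ => None,
    }
}
```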
Experiment in service of #81