Closed by nsbradford 1 year ago
Hello @nsbradford!
I think the issue here stems from the lack of a free() call when creating encoders in a loop. The following code should be valid.
import { encoding_for_model } from '@dqbd/tiktoken';

for (let i = 0; i < 1000; i++) {
  console.log(`Iteration ${i}...`);
  const encoding = encoding_for_model('gpt-4'); // allocates a new encoder
  const result = encoding.encode('Hello, world!');
  encoding.free(); // release the encoder before the next iteration
}
Future versions will attempt to address the issue by using WeakRefs, so that the encoder will unload itself, but that is not the case at the moment.
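Until then, one way to make sure free() is never skipped (for example when encode throws) is to wrap creation and cleanup in a try/finally helper. This is only a rough sketch: `withResource`, `Freeable`, and `FakeEncoding` are hypothetical names that are not part of the library, and the fake encoder is there only so the snippet runs without the package installed.

```typescript
// Minimal shape of an object that must be released manually.
// Assumption: the real encoder exposes free(), as in the loop above.
interface Freeable {
  free(): void;
}

// Hypothetical helper: create a resource, use it, and always free it.
function withResource<T extends Freeable, R>(
  create: () => T,
  use: (resource: T) => R,
): R {
  const resource = create();
  try {
    return use(resource);
  } finally {
    resource.free(); // runs even if `use` throws
  }
}

// Stand-in encoder (not the real tiktoken encoder) that counts live instances.
class FakeEncoding implements Freeable {
  static live = 0;

  constructor() {
    FakeEncoding.live++;
  }

  // Toy "encoding": one character code per character.
  encode(text: string): number[] {
    return Array.from(text, (c) => c.charCodeAt(0));
  }

  free(): void {
    FakeEncoding.live--;
  }
}

for (let i = 0; i < 1000; i++) {
  withResource(() => new FakeEncoding(), (enc) => enc.encode('Hello, world!'));
}
```

With the real package, passing `() => encoding_for_model('gpt-4')` as the `create` argument should presumably behave the same way, since the helper relies only on free() being present.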
Thanks @dqbd - verified free() works.
(JS) Tiktoken consistently and irrecoverably crashes if you call encoding_for_model() too many times. If you have a long-running process, you may need to instantiate encodings many times, and creating a global or passing around a single encoding is not a good solution.
Error:
Tiktoken works fine if I instantiate an encoding only once:
Simply catching the error and trying again also doesn't work: future calls to encoding_for_model will either fail with the same error or, if they succeed, calling encode on the result gives you an unreachable error:
Error:
Running on an M2 MacBook Pro. tiktoken-node does not appear to have this issue (though it has a separate crash that prevents lots of instantiations: https://github.com/ceifa/tiktoken-node/issues/15).