ceifa / tiktoken-node

OpenAI's tiktoken but with node bindings
103 stars 10 forks source link

Memory leak with multiple calls to encodingForModel #15

Closed nsbradford closed 1 year ago

nsbradford commented 1 year ago

The following code with many calls to encodingForModel will rapidly fill all the available memory and get killed. If you have a long-running process, you may need to instantiate encodings many times, and creating a global or passing around a single encoding is not a good solution:

import { encodingForModel } from 'tiktoken-node';

for (let i = 0; i < 1000; i++) {
  console.log(`Iteration ${i}...`);
  const encoding = encodingForModel('gpt-4');
  const result = encoding.encode('Hello, world!');
}

Error:

error Command failed.
Exit signal: SIGKILL

dqbd/tiktoken does not appear to have this issue (though it has a separate crash preventing lots of instantiations https://github.com/dqbd/tiktoken/issues/35)

ceifa commented 1 year ago

thx for reporting, will take a look this weekend

ceifa commented 1 year ago

Just checked it and there is no memory leak at all. What is happening is that encodingForModel loads a large object into the memory, and because the entire for runs on the same tick, there is not space for the GC to collect it.

One way to mitigate the memory issue is to reuse the same encoding object in each iteration of the loop rather than creating a new one each time.

import { encodingForModel } from 'tiktoken-node';

const encoding = encodingForModel('gpt-4');

for (let i = 0; i < 1000; i++) {
  console.log(`Iteration ${i}...`);
  const result = encoding.encode('Hello, world!');
}

Why do you think that create a global is not a good solution? Caching heavy objects it's a good way to improve memory/process usage.

nsbradford commented 1 year ago

Hi @ceifa - thanks for your response, and sorry for the late reply. How did you verify that there's no memory leak if you run in different ticks? When I try various methods to try to force new ticks, I still get the memory accumulation and SIGKILL. For example:

  it('should not crash when instantiated many times with different ticks', async () => {
    for (let i = 0; i < 1000; i++) {
      console.log(`Iteration ${i}...`);
      const encoding = encoding_for_model(modelName);
      encoding.encode('Hello, world!');
      await Promise.resolve(); // force new tick
      if (global.gc) {
        global.gc();
      } 
    }
  });

also - do you have any references for GC not being able to run inside of a single tick? I've seen some conflicting info, for example this StackOverflow answer gives the impression GC should be able to run at any time

ceifa commented 1 year ago

I'm not sure about the internals of V8/NodeJS. If you visualize it on devtools, you can see that it will be collected later, it can be something with how GC analyzes what should be collected or an implementation behavior on napi-rs (the package I use to make the node module).