discordjs / discord.js

A powerful JavaScript library for interacting with the Discord API
https://discord.js.org
Apache License 2.0
25.46k stars 3.97k forks source link

WebookClient/REST memory leak #8733

Open gc opened 2 years ago

gc commented 2 years ago

Which package is this bug report for?

discord.js

Issue description

I was inspecting my bots memory dump to track down whats using memory, and noticed I have 77,000 instances of REST in memory, even though I don't make them myself, nor hold any references to them or webhookclients in my code. (fyi: D.JS makes a REST for the main client, and then a new one for every new WebhookClient). These seem to never be freed from memory

  1. Instantiate some WebhookClient's
  2. Let them fall out of scope, and no longer be referenced
  3. Observe they still use memory.

These images are referenced in the code sample section below

Code sample

Repo/sample: https://github.com/gc/memleakrepo git clone git@github.com:gc/memleakrepo.git yarn && yarn start

After running this, you'll get a heap dump. Open it in chrome, and observe exactly 1000x REST objects in memory being retained, and 0x SomeClass. chrome_iqXzVWA90S

image

Package version

14.6.0

Node.js version

16.9.0

Operating system

Windows

Priority this issue should have

Medium (should be fixed soon)

Which partials do you have configured?

No Partials

Which gateway intents are you subscribing to?

Not applicable (subpackage bug)

I have tested this issue on a development release

No response

gc commented 2 years ago

Commenting discussion with kyra so its documented here:

kyra says the cause is this: https://github.com/discordjs/discord.js/blob/main/packages/rest/src/lib/RequestManager.ts#L240 and the onus is on the user to call .destroy() on webhookclients to free the memory because theres no way to fix it.

I say, surely theres a way to fix it if its just a setInterval - but otherwise lets document it clearly so its known you have to destroy it when you're done to prevent a silent memory leak.

kyranet commented 2 years ago

To give more context, WebhookClient (which extends BaseClient) keeps a REST instance, which sets up those two timers:

https://github.com/discordjs/discord.js/blob/f92be4fb944e69ae1b7cf667d99e434ad2ed7491/packages/rest/src/lib/RequestManager.ts#L240 https://github.com/discordjs/discord.js/blob/f92be4fb944e69ae1b7cf667d99e434ad2ed7491/packages/rest/src/lib/RequestManager.ts#L271

When given the default values:

https://github.com/discordjs/discord.js/blob/f92be4fb944e69ae1b7cf667d99e434ad2ed7491/packages/rest/src/lib/utils/constants.ts#L28-L30

To have short-lived WebhookClient instances, you're required to either:

  1. Call destroy(): https://github.com/discordjs/discord.js/blob/f92be4fb944e69ae1b7cf667d99e434ad2ed7491/packages/discord.js/src/client/BaseClient.js#L38-L41
  2. Define hashSweepInterval and handlerSweepInterval as 0 or Number.POSITIVE_INFINITY via options in the rest property. This will prevent intervals from being created, which should in theory avoid the hard reference. Mind you, this solution is exclusively for throw-away instances, the leak this solution causes for long-lived instances wouldn't matter much for WebhookClient since it keeps very few handlers, but it does for Client.

In theory, both options should be sufficient, please confirm if they both resolve the leak. And we can probably document this, it's a bit niche to have throw-away WebhookClients, but I understand the usage behind them.

gc commented 2 years ago

Calling destroy fixes it, its definitely the intervals. It would be nice if d.js was able to fix this internally, otherwise at the least I think it should be documented somewhere, that you have to destroy them when you're done otherwise they will silently leak memory.

kyranet commented 2 years ago

After some talking in internals, I just realised that WebhookClient is nothing but a replacement of Client for applications that don't manage a Discord bot, but a webhook. That's why both classes extend BaseClient, which is designed to be long-running.

Just like you don't create 1000 Client instances, you shouldn't be creating 1000 WebhookClient instances, but if by chance you do, you should use the destroy() method so they free their own references.

A much better alternative that is most likely your case, is to use a single Client instance and construct 1000 Webhook clients. The difference here is that unlike WebhookClient, Webhooks don't create REST instances or any other self-contained structure, instead, they use the Client's shared REST instance.

However, Webhook's constructor is marked as private:

https://github.com/discordjs/discord.js/blob/f92be4fb944e69ae1b7cf667d99e434ad2ed7491/packages/discord.js/typings/index.d.ts#L2897-L2898

This opens a couple of extra alternatives that we can do on our land:

  1. Make Webhook's constructor public, so we can instantiate them user-land.
  2. Make WebhookClient a real client of Webhook instances, with the ability to create said instances, rather than representing a single Webhook, and obviously sharing the REST instance it has with the sub-instances. This approach is possibly doable as semver-minor since Webhook already supports WebhookClient. Although the added API wouldn't make much sense until a semver-major refactor.