polkadot-js / api

Promise and RxJS APIs around Polkadot and Substrate based chains via RPC calls. It is dynamically generated based on what the Substrate runtime provides in terms of metadata.
Apache License 2.0

Memory leaks in long-running processes #5981

Open mfornos opened 4 days ago

mfornos commented 4 days ago

Using the Polkadot.js API library in long-running processes, such as a Node.js backend, is known to leak memory: consumption keeps growing until the process crashes.

Below is a summary of the major memory leaks and how to fix them:

1) Well-known memory leak in @polkadot/rpc-core storage cache:
https://github.com/polkadot-js/api/issues/5674#issuecomment-1592854562
Patch: Link

2) Function memoization used across the library via the memo utility:
https://github.com/polkadot-js/api/blob/87bee8ac29bc4b1d882307db556032a1860b9de6/packages/rpc-core/src/util/memo.ts#L20
Further investigation is needed to identify the leaky usage. As a quick fix, patch the memoize function in @polkadot/util:
Patch: Link

3) LRU cache in @polkadot/rpc-provider:
The custom LRU cache implementation does not evict its entries properly. This was supposedly fixed in https://github.com/polkadot-js/api/pull/4520, but the cache still leaks in long-running processes. As a workaround, disable the cache:
Patch: Link
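
To illustrate the general idea behind items 2 and 3, here is a minimal sketch of a memoizer whose cache is bounded by LRU eviction, so it cannot grow without limit. This is not the code from the linked patches, nor the library's actual `memo` or LRU implementation. `LruCache` and `memoizeBounded` are hypothetical names, and keying on `JSON.stringify(args)` is an assumption that only holds for JSON-serializable arguments.

```typescript
// Hypothetical helper, NOT part of @polkadot/util or @polkadot/rpc-provider.
// A Map iterates in insertion order, so its first key is the least recently used.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) {
      throw new Error('capacity must be at least 1');
    }
    this.capacity = capacity;
  }

  get size(): number {
    return this.entries.size;
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) {
      return undefined;
    }
    const value = this.entries.get(key) as V;
    // Re-insert so the key becomes the most recently used one.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // Evict the least recently used entry (the oldest key in the Map).
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }
}

// Memoize `fn`, keeping at most `capacity` results (assumed JSON-serializable args).
function memoizeBounded<A extends unknown[], R>(
  fn: (...args: A) => R,
  capacity: number
): ((...args: A) => R) & { cache: LruCache<R> } {
  const cache = new LruCache<R>(capacity);
  const memoized = (...args: A): R => {
    const key = JSON.stringify(args);
    if (cache.has(key)) {
      return cache.get(key) as R;
    }
    const result = fn(...args);
    cache.set(key, result);
    return result;
  };
  return Object.assign(memoized, { cache });
}

// Usage: with capacity 2, a third distinct key evicts the least recently used one.
let calls = 0;
const square = memoizeBounded((n: number): number => {
  calls++;
  return n * n;
}, 2);

square(1);
square(2);
square(1); // cache hit, marks key 1 as most recently used
square(3); // cache is full, evicts the entry for 2
square(2); // recomputed because it was evicted
```

The trade-off is that evicted results have to be recomputed, in exchange for a hard upper bound on the number of cached entries.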

Applying these patches will significantly reduce memory consumption and help maintain a stable footprint for long-running Node.js processes using Polkadot.js.
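
One way to check whether the footprint is actually stable is to sample heap usage at regular intervals and look at the trend. The following is a rough sketch under assumptions: `heapGrowthPerSample` and `looksLeaky` are hypothetical helpers, the threshold is arbitrary, and the sample values are made up for illustration rather than measured.

```typescript
// Hypothetical helper: least-squares slope of heap samples, in bytes per sample.
// The x values are the sample indices 0, 1, 2, ...
function heapGrowthPerSample(samples: number[]): number {
  const n = samples.length;
  if (n < 2) {
    return 0;
  }
  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((sum, y) => sum + y, 0) / n;
  let numerator = 0;
  let denominator = 0;
  samples.forEach((y, x) => {
    numerator += (x - meanX) * (y - meanY);
    denominator += (x - meanX) ** 2;
  });
  return numerator / denominator;
}

// Flags a series whose fitted growth exceeds the given (arbitrary) threshold.
function looksLeaky(samples: number[], thresholdBytesPerSample: number): boolean {
  return heapGrowthPerSample(samples) > thresholdBytesPerSample;
}

// Made-up sample series, for illustration only.
const stableSeries = [100, 102, 99, 101, 100];
const growingSeries = [100, 110, 120, 130, 140];

const stableIsLeaky = looksLeaky(stableSeries, 1);
const growingIsLeaky = looksLeaky(growingSeries, 1);
```

In a real Node.js process the samples could come from `process.memoryUsage().heapUsed` collected on a timer. Readings are noisy because of garbage collection, so a longer observation window gives a more trustworthy trend.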

We would like to investigate these issues in more detail, but due to other priorities, we currently don't have the time.

TarikGul commented 2 days ago

Thanks for all the info above, that is super helpful!

cc: @filvecchiato (since he has been working closely on performance issues in sidecar and PJS recently).

I think this is definitely something we can tackle very soon and get to the bottom of.