yjs / y-redis

Alternative backend for y-websocket
GNU Affero General Public License v3.0

Memory Leak Suspected in y-redis during Repeated WebSocket Connections and Disconnections #24

Open totorofly opened 4 months ago

totorofly commented 4 months ago

Describe the bug

I am observing a potential memory leak in the y-redis module that becomes apparent during repeated connections to and disconnections from the same room ID. Memory usage increases with each connect/disconnect cycle, especially when handling large ProseMirror documents over WebSocket connections, and eventually leads to an out-of-memory error in Node.js.

To Reproduce

  1. Set up a WebSocket server with y-redis using the registerYWebsocketServer function.
  2. Implement a setInterval that logs memory usage and writes a heap snapshot every 10 seconds:

    const heapdump = require("heapdump"); // writes V8 heap snapshots to disk

    setInterval(() => {
      const memoryUsage = process.memoryUsage();
      const memoryInMB = {
        rss: (memoryUsage.rss / 1024 / 1024).toFixed(2) + " MB",
        heapTotal: (memoryUsage.heapTotal / 1024 / 1024).toFixed(2) + " MB",
        heapUsed: (memoryUsage.heapUsed / 1024 / 1024).toFixed(2) + " MB",
        external: (memoryUsage.external / 1024 / 1024).toFixed(2) + " MB",
      };
      // connections is the server's set of open websocket connections
      console.log(`【${new Date().toLocaleString()}】`, "###connections counts###", connections.size, "memory:", memoryInMB);
      // @ts-ignore -- heapdump ships without type definitions
      heapdump.writeSnapshot((err, filename) => {
        console.log("Heap dump written to", filename);
      });
    }, 10000);
  3. Use a ProseMirror document larger than 1MB.
  4. Open and close a WebSocket connection from the frontend multiple times to the same room ID (a minimal client loop is sketched after this list).
  5. Observe the memory usage as it increases with each connect/disconnect cycle.
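
A minimal client-side loop for step 4 might look like the following. This is a hedged sketch: the server URL, room name, and cycle count are placeholders, and it assumes the frontend uses y-websocket's WebsocketProvider.

    import * as Y from "yjs";
    import { WebsocketProvider } from "y-websocket";

    // Hypothetical repro loop: connect to the same room, wait for the
    // initial sync, then disconnect -- mimicking repeated page refreshes.
    const cycle = () =>
      new Promise((resolve) => {
        const ydoc = new Y.Doc();
        const provider = new WebsocketProvider("ws://localhost:3002", "room-1", ydoc);
        provider.on("sync", () => {
          provider.destroy(); // closes the socket, like a refresh would
          ydoc.destroy();
          resolve();
        });
      });

    for (let i = 0; i < 10; i++) await cycle();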

Expected behavior

I expected memory usage to stabilize after the initial allocations, given that the connections are properly closed on the frontend.

Observed Behavior

The heapUsed metric starts at a few tens of megabytes and continuously climbs with each cycle of connecting and disconnecting to the same room ID. This increase does not stabilize or decrease, even after connections are properly closed, leading to a progressive memory buildup. Eventually, the memory usage surpasses 4GB, resulting in a Node.js out-of-memory error and the termination of the program.

Additional context

I have not yet pinpointed the exact cause, but I suspect issues in how resources are released on disconnection in the y-redis setup. I am currently unsure how to investigate further and identify the specific root cause of this memory leak, so I am reporting this issue to seek guidance and insights into what might be causing this behavior. Any assistance with diagnosing the problem would be greatly appreciated.
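
For reference, Node's built-in v8 module can also write heap snapshots without the heapdump dependency; successive snapshots can then be diffed in Chrome DevTools (Memory tab, "Comparison" view). A minimal sketch, with the SIGUSR2 trigger chosen arbitrarily rather than taken from this report:

    import * as v8 from "node:v8";

    // Write a snapshot on demand; compare successive files in DevTools.
    process.on("SIGUSR2", () => {
      const file = v8.writeHeapSnapshot(); // returns the generated filename
      console.log("Heap snapshot written to", file);
    });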

totorofly commented 4 months ago

I tried to debug using Chrome DevTools. After establishing a connection with a 1.3 MB ProseMirror document for the first time, I took a heap snapshot. Then I refreshed the document page in the browser 10 times and took a second snapshot. As shown in the picture:

(screenshot)

I found that the biggest difference between the first and second snapshots was that multiple strings corresponding to the ProseMirror document appeared under (string). Upon expanding them, I discovered that objects called Awareness retained a significant amount of memory. The Awareness here comes in through the import * as protocol from "./protocol.js"; in the ws.js code file. I therefore suspected that the Awareness-related objects in the protocol were not being released promptly (an Awareness instance registers doc listeners and a periodic check interval when constructed, so it stays reachable until destroy() is called), causing the issue.

I tried adding indexDoc.awareness.destroy(); after the ws.cork(() => { ... }) call. As shown in the screenshot,

(screenshot)
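
In isolation, the effect of that destroy() call is to tear down the Awareness instance's doc listeners and its internal check interval. A minimal standalone sketch of the lifecycle, using y-protocols (which y-redis builds on); the names here are illustrative, not the actual ws.js code:

    import * as Y from "yjs";
    import { Awareness } from "y-protocols/awareness";

    const ydoc = new Y.Doc();
    const awareness = new Awareness(ydoc);
    awareness.setLocalStateField("user", { name: "demo" });

    // Without destroy(), the Awareness instance keeps an interval timer
    // and doc listeners alive, which can pin a large document in memory.
    awareness.destroy();
    ydoc.destroy();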

I debugged again and found that although some memory growth remained, it was significantly reduced. Before the modification, the heap size at startup was around 15 MB, and after refreshing about 10 times heap usage would soar to about 120 MB. After the modification, the heap size at startup was still around 15 MB, and after the same 10 refreshes heap usage stayed around 18 MB.

I'm not sure if this modification is appropriate, but it has indeed greatly alleviated the memory usage.

totorofly commented 3 months ago

After 3 weeks of observation in production, some memory was still leaking. After also adding aw.destroy(); at the end of mergeMessages in protocol.js, the memory leak problem seems to have been eradicated.
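
For context, a rough sketch of the shape of that change. The body of mergeMessages here is illustrative only, assuming it folds queued messages into a temporary Y.Doc/Awareness pair; it is not the actual y-redis implementation:

    import * as Y from "yjs";
    import { Awareness } from "y-protocols/awareness";

    // Illustrative shape: merge queued messages via a temporary
    // doc/awareness pair, then release both so they don't accumulate.
    const mergeMessages = (messages) => {
      const ydoc = new Y.Doc();
      const aw = new Awareness(ydoc);
      try {
        // ... apply each sync/awareness message and re-encode the result ...
        return messages; // placeholder for the real merged output
      } finally {
        aw.destroy(); // the fix reported above
        ydoc.destroy();
      }
    };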

Loki-Afro commented 1 month ago

@totorofly do you think the worker is also affected by this, since it holds some awareness information that it doesn't really need and never destroys afterwards?

Not an expert at all.

agscripter commented 2 weeks ago

I have noticed this behavior as well, and it seems to be happening in the Worker code.

I deployed both the Server and the Worker in a Docker container on AWS; after running for a while with a single connection, the process crashed with the error:

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory