SeleniumHQ / selenium-ide

Open Source record and playback test automation for the web.
https://selenium.dev/selenium-ide/
Apache License 2.0
2.73k stars 739 forks source link

EventEmitter memory leak causing Selenium IDE crashing #1763

Closed edvinjunda closed 4 months ago

edvinjunda commented 5 months ago

🐛 Bug Report

When running test scripts for about 20-30 min with "Run" command, some memory leaks related with EventEmitter begin to appear and Selenium IDE starts lagging. Eventually after more than 30 min Selenium IDE window becomes white and window completely stops responding, some time after Selenium IDE crashes. When screen is white errors about WebFrameMain access pop up all the time in logs window and console window, if application was ran via it, until crash.

To Reproduce

Use "Run" command in cycles, which would run some script over and over again for longer time, better if different scripts are ran.

Expected behavior

I suppose program shouldn't crash and should be able to handle script running with "Run" command many times within 1 test.

Logs

Example of memory leak error from Selenium IDE logs:

[2024-01-08 16:57:45.433] [error] (node:17880) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 state.onMutate.response listeners added to [IpcMainImpl]. Use emitter.setMaxListeners() to increase limit

End of logs from console during the crash:

14:41:38.027 > Error sending from webFrameMain:  Error: Render frame was disposed before WebFrameMain could be accessed
    at WebFrameMain.s.send (node:electron/js2c/browser_init:2:91758)
    at WebContents.send (node:electron/js2c/browser_init:2:78126)
    at C:\Users\User\AppData\Roaming\npm\node_modules\selenium-ide\build\webpack:\src\main\api\classes\EventListener.ts:107:16
    at new Promise (<anonymous>)
    at fn (C:\Users\User\AppData\Roaming\npm\node_modules\selenium-ide\build\webpack:\src\main\api\classes\EventListener.ts:102:55)
    at map (C:\Users\User\AppData\Roaming\npm\node_modules\selenium-ide\build\webpack:\src\main\api\classes\EventListener.ts:40:62)
    at Array.map (<anonymous>)
    at Object.dispatchEvent (C:\Users\User\AppData\Roaming\npm\node_modules\selenium-ide\build\webpack:\src\main\api\classes\EventListener.ts:40:50)
    at transFn (C:\Users\User\AppData\Roaming\npm\node_modules\selenium-ide\build\webpack:\src\main\log.ts:23:30)
    at i.processMessage (C:\Users\User\AppData\Roaming\npm\node_modules\selenium-ide\build\webpack:\node_modules\.pnpm\electron-log@5.0.1\node_modules\electron-log\src\core\Logger.js:168:11)
    at i.logData (C:\Users\User\AppData\Roaming\npm\node_modules\selenium-ide\build\webpack:\node_modules\.pnpm\electron-log@5.0.1\node_modules\electron-log\src\core\Logger.js:124:10)
    at console.<computed> [as error] (C:\Users\User\AppData\Roaming\npm\node_modules\selenium-ide\build\webpack:\node_modules\.pnpm\electron-log@5.0.1\node_modules\electron-log\src\core\Logger.js:79:37)
    at WebFrameMain.s.send (node:electron/js2c/browser_init:2:91789)
    at WebContents.send (node:electron/js2c/browser_init:2:78126)
    at C:\Users\User\AppData\Roaming\npm\node_modules\selenium-ide\build\webpack:\src\main\api\classes\EventListener.ts:107:16
    at new Promise (<anonymous>)

<--- Last few GCs --->

[14116:000017A400144000]  2842676 ms: Scavenge 3721.2 (4086.5) -> 3716.7 (4087.0) MB, 28.18 / 0.00 ms  (average mu = 0.391, current mu = 0.382) allocation failure;
[14116:000017A400144000]  2842725 ms: Scavenge 3722.7 (4087.0) -> 3718.4 (4087.3) MB, 17.62 / 0.00 ms  (average mu = 0.391, current mu = 0.382) allocation failure;
[14116:000017A400144000]  2842758 ms: Scavenge 3724.2 (4087.3) -> 3719.9 (4087.8) MB, 10.02 / 0.00 ms  (average mu = 0.391, current mu = 0.382) allocation failure;

<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 00007FF7881E4BF6 node::SetTracingController+86038
 2: 00007FF7881E4E69 node::OnFatalError+585
 3: 00007FF78B3A1453 v8::Function::NewInstance+835
 4: 00007FF78B3A13E3 v8::Function::NewInstance+723
 5: 00007FF78B464AC7 v8::CppHeap::wrapper_descriptor+66519
 6: 00007FF786CDF9C8 v8::CppHeap::Terminate+91560
 7: 00007FF786CDE0D6 v8::CppHeap::Terminate+85174
 8: 00007FF786CE54DB v8::CppHeap::Terminate+114875
 9: 00007FF786CE5324 v8::CppHeap::Terminate+114436
10: 00007FF786F32B05 cppgc::internal::WriteBarrier::SteeleMarkingBarrierSlow+2677

Project file reproducing this issue (highly encouraged)

Dummy project with test created specifically to reproduce the issue (extension is changed to json to be able to upload it to GitHub). Run test "new test main". Issue was found with more complex project, but structure of test in provided dummy project is similar. In "While" command its written 8000, but at least for me value wasn't reaching 3k before crashing dummy tests.json

Environment

OS: Windows 10 Selenium IDE Version: 4.0.1-alpha.75-windows or higher

toddtarsi commented 5 months ago

I love this issue, yeah I've been having this problem for a while, but having an issue about it will give me the fire I need to resolve it. Thank you for reporting! I'll try and have some progress here this week. Thank you for the repro side file ❤️

toddtarsi commented 5 months ago

I just wanted to come back here and say that the side file you provided is an absolute banger 🥂

toddtarsi commented 5 months ago

I think I've got this resolved. I changed your test script to give me the time differential after running test case 3. It's looking good, but it's only around 1200 atm. Should be at 8k within the hour, and then I'll be able to let you know. But definitely better event lifecycle handling.

toddtarsi commented 5 months ago

@edvinjunda - Please try on latest. I think it's a lot better, I can get to around 7500 without issue. Beyond this, I probably need to look at improving react performance by breaking apart my session shape into contexts. Before I do that, I'd love to just see your thoughts with some of those listeners resolved.

edvinjunda commented 5 months ago

@edvinjunda - Please try on latest. I think it's a lot better, I can get to around 7500 without issue. Beyond this, I probably need to look at improving react performance by breaking apart my session shape into contexts. Before I do that, I'd love to just see your thoughts with some of those listeners resolved.

Sure, I will check

toddtarsi commented 5 months ago

@edvinjunda - Okay, it was better. It's not so better. I added a setTimeout to let things free up, but then it caused some of the unit tests to fail and I had to back it back out. I'll maybe look at splitting the difference here with setImmediate or w/e

toddtarsi commented 5 months ago

@edvinjunda - For now though, you can clear the event loop by using a delay setting. Should be a project config setting. I think a value of just 1ms would really help with memory management.

edvinjunda commented 5 months ago

@toddtarsi Memory leaks don't appear anymore, but UI still eventually stops responding, although it takes more time until that now. I think it is memory problem, because in task manager I see that Selenium loads processor and memory with waves when I run my project test. Currently I'm playing around with delays. I guess you could try improving react performance.

toddtarsi commented 5 months ago

@edvinjunda - I agree. I actually had your test easily getting up to like 7k before the slow down really hit. That should be how it handles if you turn that delay up to 1ms. I think I need to take time this week to debug the core of the side-runtime or maybe more side-runner cases. It's feeling brittle and sometimes a hang sneaks in. I hate to say this, but I think I like the version of the side-runner about 16 versions back more.

edvinjunda commented 5 months ago

I tried to run provided dummy tests with 1 ms delay and Selenium crashed after almost reaching 5k, then increased delay to 100ms and Selenium crashed after almost reaching 1.3k, then I tried with 1 ms again and that time it managed to get to 8k, but logs in log window were unable to catch up and during the time it was over, in logs window it was printed about 2k and even after few hours it didn't manage to print all what was logged in log files. I also had opened other programs during that time. Overall it is better now, but crashes and lags remain.

edvinjunda commented 5 months ago

It started printing logs fast again after clearing logs window and then UI crashed (became white), but Selenium kept working.

toddtarsi commented 5 months ago

@edvinjunda - Thank you for testing and the side file. I'm going to continue to refine this week. My goal's this week are first an issue where certain tests can rarely hang if they hit just the right issue at high parallel execution, and then this.

toddtarsi commented 5 months ago

@edvinjunda - Okay, pushing out a release now that deals with some of the hangs. The IDE handles playing suite windows better which might help a bit with preventing the IDE from going white, but I still need to do a lot more to figure out performant ways of doing writes on the logger component.

toddtarsi commented 4 months ago

@edvinjunda - I added back a set timeout clause when its retrying stuff. I found in some cases (playback of an empty click for example), this would lock the entire resource set until the binary exploded 😬 . This won't help in the big case, but makes things a bit better in the near term.

edvinjunda commented 4 months ago

Maybe separate issue should be created relating selenium long-term stability problems since event emitter leaks are fixed?

edvinjunda commented 4 months ago

Regarding stability, I ran dummy tests script and managed to reach 8k, but logger was lagging a lot and it reached only 2k+ in logs and UI froze completely somewhere at 2k-4k. Before selenium would crash after first logs from logger clearing when logger was lagging, now it didn't from first time, although it did later when I cleared like 10 times in a row 😄. Generally from that perspective it is working more stably.

I also tried running my project, but it worked slightly better and I think mostly because I disabled few "Run" commands, one of which had another "Run" command. I think maybe large use of "Run" commands is also causing overload and eventually crashes, since dummy tests script doesn't have many "Run" commands, meanwhile my project has a lot and many of them are used in several cycles.

Another weird thing which I noticed is that when logger is lagging, it is paused when actual pause starts, even though not all logs of commands are printed which go before that pause. This doesn't affect logs printed into files I think.

toddtarsi commented 4 months ago

@edvinjunda - I would love that! Yeah, I think that maybe I'm using an inefficient append operation or something like that, which causes the work of appending to scale up linearly. Thank you for testing!!!

edvinjunda commented 4 months ago

I guess this issue can be closed now

toddtarsi commented 4 months ago

@edvinjunda - Thanks for the help with everything!

github-actions[bot] commented 3 months ago

This issue has been automatically locked since there has not been any recent activity since it was closed. Please open a new issue for related bugs.