tomcl / issie

Issie - an intuitive cross-platform hardware design application. https://tomcl.github.io/issie
GNU General Public License v3.0
66 stars 96 forks source link

Long simulation bug #442

Open tomcl opened 3 months ago

tomcl commented 3 months ago

Describe the bug White-screen crash with issie unresponsive after has been reported when doing long simulations of large designs.

To Reproduce Not yet enough information

Additional context This was reported Spring 2024 by an Issie user. It is hoped that they will be able to recreate the issue with dev tools open and report the stacktrace that will probably diagnose this problem.

tomcl commented 3 months ago

Instrumentation shows that the heap used by the simulation, while quite large, is not the determining factor here.

See #447 for how to make the simulator handle long simulations better with less heap use, but in prcatice that has no relevance to this issue.

samuelpswang commented 3 months ago

Steps To Reproduce Bug

  1. Open cpu in Demo Projects.
  2. Go to Simulations > Waveform Simulation > Start Simulation.
  3. Go to Select Waves, select all waves by opening sub-menus and making sure they are clicked. This should result in 290 waves selected, but over 200+ waves should work.
  4. Type in 200 in the Cycles input box. Presumably any large number before 1000 would do.
  5. Try to drag the DATAPATH custom symbol on the diagram on the left.
  6. Blank screen.

NB - Going forward or backwards in the waveform simulator will not result in crash.

Developer Tools Screenshot

Developer tools was turned on at start up, but did not show anything useful at time of crash. It showed that the devtool window was disconnected at time of crash. See screenshot below.

blank-screen-devtools

Next Steps

Not sure why this would happen or how to proceed. Can Dr. Clarke (@tomcl) provide pointers on how to verify whether this is caused by the aforementioned circular buffer issues?

tomcl commented 3 months ago

Well done! It is a bit of a mystery to me.

One suggestion - it is a memory problem. perhaps keeping dev tools open and graphing memory use and recreating the bug would show this (not sure whether dev tools can provide continuous indication of memory use - rather than graphing it at end after which maybe it will have crashed?).

Another suggestion: trace SVG generation. That would seem to be the issue?

Finally - there is some code that puts up a progress control when SVG generation takes too long. Maybe that is going wrong - but the issue I do not understand is why no backtrace. Perhaps in some circumstances an infinite loop?

tomcl commented 3 months ago

The heap size goes quite quickly up above 500MB in this situation and given dev tools itself uses heap it looks like the heap limit may be the problem. we could print out heap memory use to console to diagnose better because I do not trust dev tools heap profiling.

https://medium.com/weekly-webtips/increase-node-js-memory-limit-bonus-pm2-62cda081ce2a#:~:text=js%20(up%20to%2011.,js%20application.

The bug happens when expanding a small window for up to 1000 clocks.

https://github.com/electron/electron/issues/2056

Could be followed to increase the heap size and see if that helps - but we probably also want to limit the number of waveforms we create - fi there is a heap size issue.

This is not conclusive. it could also be some weird bug dependent on number of waveforms and number of clocks - perhaps to do with rerunning the simulation (which happens sometimes). But if increasing heap size fixes it we can be pretty sure it is that.

tomcl commented 3 months ago

I have tried implementing that fix. See commit on dev_tc for what I have done. I am not even sure the heap size is being increased because I cannot easily get the v8.getHeapSatatistics() method to work - it causes an infinite loop. Possibly this requires "require v8" - all this js stuff is a bit opaque to me, and because it is js via F# and fable it is obscure to a lot of people.

I am still not convinced this is a heap limit. Printing out memory usage after each simulation is crashed at 720MB. That is suspiciously close to the standard 32 bit memory limit of 700MB or so. But surely we run a 64 bit version of electron? And the fix to increase heap size to 7500MB should be working...

I guess it needs checking:

tomcl commented 3 months ago

The backytrace displayed by the main process (which I guess comes from the renderer) make sit clear this is a heap overflow probelm - but presumably one in the svg display code, not issie itself.

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory 1: 00007FF6AB500D36 node::SetTracingController+80134 2: 00007FF6AB500FF5 node::OnFatalError+645 3: 00007FF6AE15B873 v8::Function::NewInstance+851 4: 00007FF6AE15B808 v8::Function::NewInstance+744 5: 00007FF6AE21C417 v8::CppHeap::CollectGarbageInYoungGenerationForTesting+67671 6: 00007FF6ACDFF0E6 v8::CppHeap::GetHeapHandle+86326 7: 00007FF6ACDFA526 v8::CppHeap::GetHeapHandle+66934 8: 00007FF6ACDEC626 v8::CppHeap::GetHeapHandle+9846 9: 00007FF6AD012613 v8::CppHeap::GetHeapHandle+2262627 10: 00007FF63FE9C3F9 Error sending from webFrameMain: Error: Render frame was disposed before WebFrameMain could be accessed at s.send (node:electron/js2c/browserinit:2:86493) at .send (node:electron/js2c/browser_init:2:72283) at BrowserWindow. (C:\GitHub\issie\build\index.js:21626:25) at BrowserWindow.emit (node:events:525:35) Error sending from webFrameMain: Error: Render frame was disposed before WebFrameMain could be accessed at s.send (node:electron/js2c/browserinit:2:86493) at .send (node:electron/js2c/browser_init:2:72283) at BrowserWindow.tupledArg (C:\GitHub\issie\build\index.js:21631:27) at BrowserWindow.emit (node:events:513:28) Error sending from webFrameMain: Error: Render frame was disposed before WebFrameMain could be accessed at s.send (node:electron/js2c/browserinit:2:86493) at .send (node:electron/js2c/browser_init:2:72283) at BrowserWindow. (C:\GitHub\issie\build\index.js:21626:25) at BrowserWindow.emit (node:events:525:35)

tomcl commented 3 months ago

To wrap up the heap size maze.

https://github.com/electron/electron/issues/31330

It turns out that a hard heap size is limited to 4GB by a v8 pointer compression feature that electron will not ever be able to switch off. Total memory size is not so limited - unfortunately in F# almost everything is allocated on heap.

https://www.electronjs.org/blog/v8-memory-cage

That is sad - but not disastrous since Issie does not use anything like that amount of memory. The heap out of memory message here must be related to the SVG rendering process.

The symptoms are that this error happens only when making a large number of long SVGs.

Given this, my best guess is that the problem is the total SVG polyline path size, or the total SVG font size. We have 4 path points and up to one text element with 10 chars per clock cycle. Currently 1000 clock cycles are always generated.

It is therefore plausible that even though most of this stuff is not displayed, the SVG object size is too large. Part of the polylines is displayed and it is possible that therefore they are all processed. For 200 waveforms we have 1,000,000 polyline points.

Fixes

tomcl commented 3 months ago

checking the conditions for this crash to happen:

  1. it never happens when waveforms are zoomed out
  2. it happens when waveforms are zoomed in
  3. The heap use increases a large amount as waveforms are zoomed in (before the crash happens)

image

This shows memory use as waveforms are remade under conditions that will cause the crash at high zoom-in. The three levels of memory use correspond to making waveforms after pressing the zoom-in button twice. Each time memory usage goes up, and also the time taken to make the waveforms increases by a large amount.

Memory allocation trace shows most of the allocated memory is SVG text: higher zoom allocations: image

lower zoom allocations: image

The difference here reflects the fact that at higher zooms more numbers are printed on the waveforms.

Making a circuit with 1000 binary waveforms - each changing every cycle - the system seems much more robust and SVG regeneration time is constant with zoom.

So we have three fixes: