Akuli / porcupine

A decent editor written in tkinter
MIT License

memory leak #1193

Open Akuli opened 2 years ago

Akuli commented 2 years ago

[Image: memleak]

This laptop has 2GB ram, 2GB swap. Porcupine had 4 tabs opened when I closed it, and had been opened the whole day. Based on the graph it used:

I also opened more tabs after taking the screenshot, but that didn't do anything.

Porcupine version is 2022.08.28

I spent by far most of the day editing Rust files (tree-sitter highlighter, no langserver), mostly a file that is about 1400 lines long.

Akuli commented 1 year ago

Seems to be the new highlighter. Edited a bunch of text files and asymptote files today (both with pygments), no memory usage problems.

Akuli commented 1 year ago

Next time this happens, I will try running gc.collect() in Porcupine's debug prompt. It seems that without an explicit gc.collect(), old tabs are sometimes deleted right away and sometimes only later, when the automatic GC run kicks in.

Akuli commented 1 year ago

gc.collect() doesn't seem to do anything. The next step would probably be making a script that reproduces the leak by e.g. typing code into Porcupine.

Moosems commented 1 year ago

undo redo stack? Maybe an arbitrary list somewhere that never gets used but keeps getting added to?

Akuli commented 1 year ago

It has to do with tabs: most things (undo-redo stacks etc) should be cleaned up when you close a tab, but not everything is getting cleaned.

I also tried gc.collect(). It didn't do anything.

Moosems commented 1 year ago

What information does the tab store?

Akuli commented 1 year ago

Not much in tabs.py, but most plugins have some kind of tab-specific data, so unless we assume it is the new syntax highlighter, we need to check them all.

The next step would be to make a script that reproduces the problem. For example:

We could then re-run the script with different settings in Porcupine: all plugins disabled, all plugins enabled, half of the plugins disabled, Rust syntax highlighting done with the old syntax highlighter, and so on.

Moosems commented 1 year ago

Keeping a debugger to calculate memory size of plugins would be good for the future
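
One possible way to get that kind of per-plugin accounting is the standard library's `tracemalloc`, which can diff two snapshots grouped by source file. This is a minimal sketch, not something Porcupine is known to ship: `top_growth` is a made-up helper name, and the list of bytearrays just stands in for a leaky plugin. Note that `tracemalloc` only sees allocations made through Python's allocators, so memory held on the Tcl/Tk side would not show up.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return (filename, bytes_gained) pairs for the files that changed the most."""
    diffs = after.compare_to(before, "filename")  # sorted, biggest change first
    return [(d.traceback[0].filename, d.size_diff) for d in diffs[:limit]]


tracemalloc.start()
before = tracemalloc.take_snapshot()

# Stand-in for a leaky plugin: about 1 MB that is never released.
leaked = [bytearray(1000) for _ in range(1000)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

for filename, gained in top_growth(before, after):
    print(f"{gained:+10d} bytes  {filename}")
```

Since each plugin lives in its own file, grouping by filename would roughly attribute growth to individual plugins.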

littlewhitecloud commented 1 year ago

Tkinter has a memory leak problem: if you use a tkinter application for a long time, it will take up a lot of memory (even if you try to run the garbage collector).

Akuli commented 1 year ago

I don't think it's just tkinter having a problem. I know pretty well how tkinter works, and as long as you destroy widgets and make sure you don't keep around weird cyclic references with binds (which Porcupine does), it shouldn't leak. That said, next time this happens I might look around and see if there are many unnecessary widgets in memory.

Moosems commented 1 year ago

Can you try and reproduce the issue?

rdbende commented 1 year ago

I can reproduce this issue. After a few hours of use, Porcupine used half a gig of memory, and when reopened, it only used 70MB. When I open a new file with ~200 lines of Python code, Porcupine's memory usage increases by about 3 MB, but when I close that tab, the memory usage drops back by only 100-200 KB.

rdbende commented 1 year ago

> Seems to be the new highlighter. Edited a bunch of text files and asymptote files today (both with pygments), no memory usage problems.

Disabling the highlight plugin doesn't help for me, but Porcupine uses 30 MB less on startup.

Moosems commented 1 year ago

What's storing that much data in general?

rdbende commented 1 year ago

The plugins :)

Moosems commented 1 year ago

That's a decent bit. Do the plugins work on a tab-by-tab basis?

rdbende commented 1 year ago

They are loaded once globally, but have functions bound to the tabs (not all plugins, though).

Moosems commented 1 year ago

So the binds aren't being removed, or...?

rdbende commented 1 year ago

It's not necessarily the bindings. It can be any other data.

Moosems commented 1 year ago

This is completely unrelated to the question, but what laptop are you using that only has 4 gigs ram total?

rdbende commented 1 year ago

A ThinkPad for example :)) But also my Samsung laptop only has 4GB of RAM.

Akuli commented 1 year ago

It is indeed a thinkpad. It actually has 2GB RAM. The other 2GB is swap, so basically disk space that the OS treats as RAM when it runs out of real RAM.

[Image: 20230630_004905]

rdbende commented 1 year ago

Well, THAT IS a thinkpad :D

Moosems commented 1 year ago

> Well, THAT IS a thinkpad :D

And a thoroughly used one.

Akuli commented 5 months ago

I upgraded the RAM to 4GB for now.

Moosems commented 5 months ago

A mem leak is still a mem leak 🤷‍♂️

Akuli commented 5 months ago

Sure, but it's no longer something I run into regularly.

Moosems commented 5 months ago

Have you been able to reproduce it?

Akuli commented 5 months ago

Yes, today I wrote a script that opens and closes tabs. It reproduces.

Moosems commented 5 months ago

Right then, time to narrow down the issue! Does this happen when all non-essential plugins are turned off?

Akuli commented 5 months ago

If you're so excited about this, you might as well narrow it down yourself :)

The script is a plugin on the memleak-test branch.

Moosems commented 5 months ago

I don't even use Porcupine, I simply see mem leaks as fairly important 😄

Akuli commented 5 months ago

Turns out that the core is leaky as well. Here are some results with all plugins (except the memleak test itself) disabled.

39.05MB (+2.01MB)      ********************
40.40MB (+1.35MB)      *************
41.75MB (+1.35MB)      *************
43.11MB (+1.35MB)      *************
44.46MB (+1.35MB)      *************
45.81MB (+1.35MB)      *************
47.16MB (+1.35MB)      *************
48.51MB (+1.35MB)      *************
49.86MB (+1.35MB)      *************
51.22MB (+1.35MB)      *************
52.57MB (+1.35MB)      *************
53.92MB (+1.35MB)      *************
55.27MB (+1.35MB)      *************
56.62MB (+1.35MB)      *************
56.89MB (+270.34kB)    **
57.16MB (+270.34kB)    **
57.43MB (+270.34kB)    **
57.70MB (+270.34kB)    **
57.97MB (+270.34kB)    **
57.97MB (+0B)          
58.25MB (+270.34kB)    **
58.52MB (+270.34kB)    **
58.52MB (+0B)          
58.79MB (+270.34kB)    **
59.06MB (+270.34kB)    **
59.33MB (+270.34kB)    **
59.33MB (+0B)          
59.60MB (+270.34kB)    **
59.60MB (+0B)          
59.87MB (+270.34kB)    **
60.14MB (+270.34kB)    **
60.14MB (+0B)          
60.41MB (+270.34kB)    **
60.68MB (+270.34kB)    **
60.68MB (+0B)          
60.95MB (+270.34kB)    **
61.22MB (+270.34kB)    **
61.22MB (+0B)          
61.49MB (+270.34kB)    **
61.76MB (+270.34kB)    **
61.76MB (+0B)          
62.03MB (+270.34kB)    **
62.30MB (+270.34kB)    **
62.30MB (+0B)          
62.57MB (+270.34kB)    **
63.30MB (+733.18kB)    *******
63.57MB (+266.24kB)    **
63.57MB (+0B)          
63.84MB (+270.34kB)    **
64.11MB (+270.34kB)    **
64.11MB (+0B)          
64.38MB (+270.34kB)    **
64.65MB (+270.34kB)    **
64.65MB (+0B)          
64.92MB (+270.34kB)    **
65.19MB (+270.34kB)    **
65.19MB (+0B)          
65.46MB (+270.34kB)    **
65.73MB (+270.34kB)    **
65.73MB (+0B)          
66.00MB (+270.34kB)    **
66.00MB (+0B)          
66.27MB (+270.34kB)    **
66.54MB (+270.34kB)    **
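
For reading the chart: the bar length looks like one `*` per full 100 kB of growth, with sizes in decimal units (1.35MB gives 13 stars, 270.34kB gives 2). Below is a small sketch that formats a line the same way. It is inferred purely from the output above rather than taken from the memleak-test plugin, and `human` and `report_line` are made-up names.

```python
def human(num_bytes):
    """Format a byte count with decimal units, e.g. 270336 -> '270.34kB'."""
    if num_bytes >= 1_000_000:
        return f"{num_bytes / 1_000_000:.2f}MB"
    if num_bytes >= 1_000:
        return f"{num_bytes / 1_000:.2f}kB"
    return f"{num_bytes}B"


def report_line(current, previous):
    """One report line: total, change since the last line, and a bar."""
    delta = current - previous
    sign = "-" if delta < 0 else "+"
    text = f"{human(current)} ({sign}{human(abs(delta))})"
    bar = "*" * max(delta // 100_000, 0)  # one star per full 100 kB gained
    return text.ljust(23) + bar


print(report_line(40_400_000, 40_400_000 - 1_351_680))
print(report_line(57_970_000, 57_970_000))
```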

Each line in the results is the result of opening and closing 10 tabs. In other words, the memleak test plugin does this in a loop:

Moosems commented 5 months ago

Well, that's certainly a development.

Akuli commented 5 months ago

I'm discovering more stuff...

  1. Disable all plugins except debug prompt plugin and the memleak test
  2. Run the memleak test a little bit (I don't care what happens when the first tab is opened, because that won't accumulate over time)
  3. Open debug prompt and run import porcupine, gc, then set porcupine.a = {id(x) for x in gc.get_objects()}
  4. Run memleak test more
  5. Open debug prompt again and set b = [x for x in gc.get_objects() if id(x) not in a]

This creates a list of about 60000 Python objects, which presumably consists of the debug prompt and a bunch of leaked stuff. About 40000 of these are cell objects, whatever they are (can't find in gc docs). I will later go through some of the remaining 20000 objects one by one to find out what they are.

Edit: Now that I do this again, I got only 2000 cell objects, and about 20000 total. Weird.
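
The snapshot-and-diff steps above can be sketched as a standalone script. This is only an illustration: `snapshot_ids`, `new_objects_since`, and the `Leaky` class are made-up names, and a plain list stands in for whatever is holding on to closed tabs. Two caveats: `gc.get_objects()` only returns objects tracked by the cyclic collector, and because `id()` values can be reused once an object is freed, a few new objects may be missed.

```python
import gc
from collections import Counter


def snapshot_ids():
    """Remember the identity of every object the collector currently tracks."""
    return {id(obj) for obj in gc.get_objects()}


def new_objects_since(before_ids):
    """Tracked objects whose id was not present in the earlier snapshot."""
    return [obj for obj in gc.get_objects() if id(obj) not in before_ids]


class Leaky:
    """Hypothetical stand-in for per-tab data that never gets released."""


registry = []  # simulates a forgotten reference that keeps objects alive

before = snapshot_ids()
for _ in range(100):
    registry.append(Leaky())
gc.collect()  # anything that survives this is still referenced somewhere

survivors = new_objects_since(before)
counts = Counter(type(obj).__name__ for obj in survivors)
print(counts.most_common(5))
```

Cell objects, which dominated the first count, are what closures use to hold captured variables, so a large number of them likely points at nested functions or lambdas being kept alive.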

Akuli commented 5 months ago

I looked at the types of the leaky objects. (But again, this includes the debug tab itself, so one tkinter.ttk.Scrollbar is expected, for example.)

>>> pprint.pprint(dict(Counter(str(type(o)) for o in b)))
{"<class '_io.StringIO'>": 1,
 "<class 'builtin_function_or_method'>": 6,
 "<class 'cell'>": 1810,
 "<class 'contextlib.redirect_stderr'>": 1,
 "<class 'contextlib.redirect_stdout'>": 1,
 "<class 'dataclasses.Field'>": 902,
 "<class 'dataclasses._DataclassParams'>": 901,
 "<class 'dict'>": 3963,
 "<class 'function'>": 3608,
 "<class 'getset_descriptor'>": 1802,
 "<class 'list'>": 905,
 "<class 'method'>": 17,
 "<class 'porcupine.plugins.porcupine_debug_prompt.PromptTab'>": 1,
 "<class 'set'>": 903,
 "<class 'tkinter.CallWrapper'>": 7,
 "<class 'tkinter.Event'>": 1,
 "<class 'tkinter.Text'>": 1,
 "<class 'tkinter.font.Font'>": 3,
 "<class 'tkinter.ttk.Frame'>": 4,
 "<class 'tkinter.ttk.Scrollbar'>": 1,
 "<class 'tuple'>": 3646,
 "<class 'type'>": 902,
 "<class 'weakref.ReferenceType'>": 901}

A couple things stick out to me as weird:

This weird function is probably responsible for the dataclasses related stuff:

https://github.com/Akuli/porcupine/blob/efcd3b1c9e597a2fb4cdaadd1a51308cd7d829db/porcupine/settings.py#L75-L82

I want to rewrite porcupine.settings anyway, so I hope it will go away eventually.

Functions are probably the leaking thing. Maybe tkinter doesn't interact particularly well with lambdas, closures, and reference counting.

I think the methods are just the debug prompt's stuff, not leaked memory. Here they are:

[<built-in method splitlist of _tkinter.tkapp object at 0x7f991706a9b0>,
 <built-in method call of _tkinter.tkapp object at 0x7f991706a9b0>,
 <built-in method splitlist of _tkinter.tkapp object at 0x7f991706a9b0>,
 <built-in method call of _tkinter.tkapp object at 0x7f991706a9b0>,
 <built-in method splitlist of _tkinter.tkapp object at 0x7f991706a9b0>,
 <built-in method call of _tkinter.tkapp object at 0x7f991706a9b0>,
 <bound method Misc._substitute of <tkinter.Text object .!panedwindow.!panedwindow.!tabmanager.!prompttab2.!text>>,
 <bound method CallWrapper.__call__ of <tkinter.CallWrapper object at 0x7f991682f550>>,
 <bound method Misc._substitute of <tkinter.Text object .!panedwindow.!panedwindow.!tabmanager.!prompttab2.!text>>,
 <bound method CallWrapper.__call__ of <tkinter.CallWrapper object at 0x7f991682e110>>,
 <bound method Misc._substitute of <tkinter.Text object .!panedwindow.!panedwindow.!tabmanager.!prompttab2.!text>>,
 <bound method CallWrapper.__call__ of <tkinter.CallWrapper object at 0x7f991682f610>>,
 <bound method Scrollbar.set of <tkinter.ttk.Scrollbar object .!panedwindow.!panedwindow.!tabmanager.!prompttab2.!scrollbar>>,
 <bound method CallWrapper.__call__ of <tkinter.CallWrapper object at 0x7f99168ed910>>,
 <bound method YView.yview of <tkinter.Text object .!panedwindow.!panedwindow.!tabmanager.!prompttab2.!text>>,
 <bound method CallWrapper.__call__ of <tkinter.CallWrapper object at 0x7f9916804f50>>,
 <bound method Misc._substitute of <porcupine.plugins.porcupine_debug_prompt.PromptTab object .!panedwindow.!panedwindow.!tabmanager.!prompttab2>>,
 <bound method CallWrapper.__call__ of <tkinter.CallWrapper object at 0x7f991672eb90>>,
 <bound method PromptTab.on_enter_key of <porcupine.plugins.porcupine_debug_prompt.PromptTab object .!panedwindow.!panedwindow.!tabmanager.!prompttab2>>,
 <bound method Misc._substitute of <tkinter.Text object .!panedwindow.!panedwindow.!tabmanager.!prompttab2.!text>>,
 <bound method CallWrapper.__call__ of <tkinter.CallWrapper object at 0x7f991672d310>>,
 <bound method _RedirectStream.__exit__ of <contextlib.redirect_stdout object at 0x7f991672c190>>,
 <bound method _RedirectStream.__exit__ of <contextlib.redirect_stderr object at 0x7f991672e110>>]

I think this is as far as I will look into this for now.

Akuli commented 5 months ago

Just looked at the leaky functions. The weird function in porcupine.settings is leaking a lot of auto-generated __init__, __repr__ and __eq__ methods, lol. Not exactly what I expected the leak to be, but we'll see if that's really the problem, whenever I feel like continuing this :)
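
For context, a class created inside a function always takes part in reference cycles, so it and its auto-generated methods stay in memory until the cyclic collector runs. Below is a minimal sketch of that effect. It assumes the function builds a new dataclass on every call; `make_container` is a hypothetical stand-in, since the linked code is not shown in this thread.

```python
import dataclasses
import gc
import weakref


def make_container(value):
    # Hypothetical stand-in: builds a brand-new dataclass on every call.
    @dataclasses.dataclass
    class ValueContainer:
        value: object

    return ValueContainer(value)


gc.disable()  # rule out an automatic collection during the demo
container = make_container(123)
class_ref = weakref.ref(type(container))
del container

# The class refers to itself (for example through its __mro__), so
# reference counting alone cannot free it.
alive_before_collect = class_ref() is not None

gc.collect()
alive_after_collect = class_ref() is not None
gc.enable()

print(alive_before_collect, alive_after_collect)
```

This only shows why such classes linger until a collection. It does not explain why they would survive an explicit `gc.collect()`, as reported earlier in the thread.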

Akuli commented 5 months ago

lol :DDDD

I added these lines into the weird func...

    del ValueContainer.__init__
    del ValueContainer.__repr__
    del ValueContainer.__eq__

and it already leaks noticeably less :D Still leaky though.