prompt-toolkit / python-prompt-toolkit

Library for building powerful interactive command line applications in Python
https://python-prompt-toolkit.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
9.11k stars 718 forks source link

[memory leak]PromptSession wouldn't free #1688

Closed vicalloy closed 1 year ago

vicalloy commented 1 year ago

I use PromptSession with prompt_toolkit.contrib.telnet, and find it have serious memory leak. PromptSession will never free.

  1. Add self.m = list(range(1000*1000)) to prompt_toolkit.shortcuts.prompt.PromptSession.__init__
  2. Run the follow code. The memory increase fast and never free.

I try to find out the reason. from prompt_toolkit.filters.base import _and_cache, _or_cache, _invert_cache will hold Filter, and the Filter hold PromptSession. prompt_toolkit.cache.memoized also hold PromptSession.

I can free _and_cache, _or_cache, _invert_cache, but don't have any way to free the cache hold by memoized. I add a var _memoized_cache_set to access cache created by memoized. After clear _and_cache/_or_cache/_invert_cache/memoized, PromptSession will free. If called await session.prompt_async(), PromptSession wouldn't free again(only one instance can't free).

import asyncio
import time
import gc

from prompt_toolkit import PromptSession

async def create_session():
    try:
        session = PromptSession()
        # await session.prompt_async()
    except KeyboardInterrupt:
        print("pass")
        del session
        gc.collect()

async def m():
    for _ in range(3):
        await create_session()
        time.sleep(3)

asyncio.run(m())
jonathanslenders commented 1 year ago

That's a good find. This isn't great. We can't do without the memoization, or (CPU) performance will be worse. I don't know what's best, either use some weak references, or have an Filter._and_cache: _AndCache attribute instead of a global. The latter could work.

jonathanslenders commented 1 year ago

There should be something else. I did an attempt to remove the global state in the filters here: https://github.com/prompt-toolkit/python-prompt-toolkit/pull/1690

But, even removing that cache completely + removing any cache from memoization elsewhere is not sufficient. I don't have much time right now, but I guess there's maybe another place that refers the session instance from a global state.

FYI, I'm using this to count the number of instances:

import gc
obj = gc.get_objects()
print('len', len([o for o in obj if isinstance(o, PromptSession)]))
vicalloy commented 1 year ago

Thanks for you reply. I add a var _memoized_cache_set to access @memoized. After clean all cache, PromptSession can be free. I add a function to clean all cache, and called it after a telnet connection disconnected. It isn't elegant but it works.

import asyncio
import gc
from prompt_toolkit.filters.base import _and_cache, _or_cache, _invert_cache
from prompt_toolkit.cache import _memoized_cache_set

from prompt_toolkit import PromptSession

def clean_cache():
    _and_cache.clear()
    _or_cache.clear()
    _invert_cache.clear()
    # _memoized_cache_set used to access @memoized
    for cache in _memoized_cache_set:
        cache.clear()

async def print_count():
    # KeyProcessor timeout is 1s. PromptSession will free after 1s
    await asyncio.sleep(1)  # waiting KeyProcessor timeout
    clean_cache()
    gc.collect()
    obj = gc.get_objects()
    print('PromptSession count: ', len([o for o in obj if isinstance(o, PromptSession)]))

async def m():
    for i in range(2):
        session = PromptSession()
        try:
            await session.prompt_async()
        except KeyboardInterrupt:
            print("pass", i)
        del session
        await print_count()

asyncio.run(m())
jonathanslenders commented 1 year ago

I think I got it. Can you check https://github.com/prompt-toolkit/python-prompt-toolkit/pull/1690 again?

(I can't merge it right away though, it will break ptpython as it is, because every Filter subclass requires calling super(), which we forgot in one place in ptpython. But I'll push a new ptpython release and some time late this can get merged.)

vicalloy commented 1 year ago

Thanks you. https://github.com/prompt-toolkit/python-prompt-toolkit/pull/1690 works.