python / cpython

The Python programming language
https://www.python.org

Memory leak: warnings module #97557

Open teplyakoff opened 2 years ago

teplyakoff commented 2 years ago

Bug report

Memory leak in the warnings module. Steps to reproduce:

  1. Create a test script:

         $ cat > memory_leak_warnings.py << EOF
         import warnings
         import random

         while True:
             warn_text = str(random.random())
             warnings.warn(warn_text, DeprecationWarning)
         EOF

  2. Run the script:

         $ docker run -it --rm --name memory-leak-warnings -v "$PWD":/usr/src/myapp -w /usr/src/myapp python:3.9 python memory_leak_warnings.py

  3. Monitor the memory usage of the container running the script:

         $ docker stats memory-leak-warnings

You can then see MEM USAGE increasing constantly (screenshot from 2022-09-26 10-48-45).

https://docs.python.org/3/library/warnings.html warnings is a standard module that anyone can use. In an application with a lot of dependencies, you have no idea which modules use the warnings module, so in production you will have a memory leak. For example, aiomysql uses it: https://github.com/aio-libs/aiomysql/blob/master/aiomysql/cursors.py#L479 and it seems perfectly fine to use the standard warnings module.

I have to disable warnings to get rid of this kind of memory leak. Is that OK? Or should warnings be disabled in production environments? If so, this should be noted in the documentation.

Thanks

Your environment

teplyakoff commented 2 years ago

$ cat > memory_leak_warnings.py << EOF
import warnings
import random

while True:
    warn_text = str(random.random())
    warnings.warn(warn_text, DeprecationWarning)
EOF

vstinner commented 2 years ago

To be able to emit a warning only once, Python has to remember which messages were already emitted. If the warning is always emitted, there is no such memory increase:

import warnings
import random
warnings.simplefilter("always", DeprecationWarning)
while True:
    warn_text = str(random.random())
    try:
        warnings.warn(warn_text, DeprecationWarning)
    except DeprecationWarning:
        pass

Maybe we should just document this behavior, and explain how to work around it if it becomes an issue.
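A minimal sketch of why this workaround helps, assuming the bookkeeping lives in the calling module's `__warningregistry__` dictionary (a CPython implementation detail that could change). The helper name `count_registry_entries` is hypothetical; it counts how many entries the registry holds after unique messages are emitted under a given filter action.

```python
import warnings


def count_registry_entries(action, n_messages):
    """Emit n_messages unique warnings under `action`; return how many were remembered."""
    # warnings.warn() records what it has seen in the caller's module-level registry.
    registry = globals().setdefault("__warningregistry__", {})
    # record=True keeps the warnings off stderr; the filters are restored on exit.
    with warnings.catch_warnings(record=True):
        warnings.simplefilter(action)
        for i in range(n_messages):
            warnings.warn(f"unique message {i}", UserWarning)
        # The registry also holds a "version" bookkeeping key; skip it.
        return sum(1 for key in registry if key != "version")


if __name__ == "__main__":
    print("default:", count_registry_entries("default", 1000))
    print("always: ", count_registry_entries("always", 1000))
```

Under "default" the count should track the number of unique messages, which is the growth seen in the reproducer, while under "always" nothing should be stored.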

teplyakoff commented 2 years ago

@vstinner Or, more generally: warnings.simplefilter("always"). My point is that I may use many dependencies, those dependencies have dependencies of their own, and any of those libraries can use the warnings module. I can't know which type of warning it will emit: DeprecationWarning, RuntimeWarning, and so on. So in production we always have to call warnings.simplefilter("always") to prevent this kind of memory leak. Or it should be the default behaviour of the warnings module.

vstinner commented 2 years ago

In 2017, I reduced the memory leak with commit c9758784eb321fb9771e0bc7205b296e4d658045 for the ignore action: issue #71722: Fix memory leak with warnings ignore (PR #4489).

teplyakoff commented 2 years ago

Do I understand correctly? Using the always and ignore filters is OK, but using any other filter will result in a memory leak, so those are not for production environments!

Is this expected?

ericvsmith commented 2 years ago

I believe @vstinner is saying that there's not a memory leak, but that using the feature requires memory. In order to provide the functionality of "only print a warning the first time" (that is, action is "default", "module", or "once"), the module has to remember which locations and/or messages it has seen before. That consumes memory.
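The "only the first time" behavior can be observed with a short sketch. The helper name `count_shown` is hypothetical; it emits the same message repeatedly from a single line and reports how many warnings actually came through under a given action.

```python
import warnings


def count_shown(action, repeats):
    """Emit one warning `repeats` times from the same line; return how many were shown."""
    # record=True collects the shown warnings in a list instead of printing them.
    with warnings.catch_warnings(record=True) as shown:
        warnings.simplefilter(action)
        for _ in range(repeats):
            warnings.warn("same message, same location", UserWarning)
        return len(shown)


if __name__ == "__main__":
    for action in ("default", "always"):
        print(action, count_shown(action, 5))
```

Suppressing the repeats under "default" is only possible because the first occurrence was remembered, and that memory is what grows when every message is different.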

Is this a practical problem for you? Your example is not a realistic, real-world use case.

teplyakoff commented 2 years ago

@ericvsmith You wrote:

functionality of "only print a warning the first time" (that is, action is "default", "module", or "once"), the module has to remember which locations and/or messages it has seen before.

I didn't want to use such functionality. Moreover, I didn't even know about it. I just use Python with the default behavior of a standard module. That's why I think it would be better if the default behavior did not eat so much memory.

About a real-world example: I have a process constantly running as a RabbitMQ consumer; after some business logic, it has to insert some rows into a MySQL table. And this process leaks. After some research I traced the leak to the MySQL library, here: https://github.com/aio-libs/aiomysql/blob/5877a8803f2dae05ea6d3b9acecea1614ea7a973/aiomysql/cursors.py#L489 Actually, it is not in the library but in the warnings module. Yes, you can tell me I should write good queries, but please tell me: what if the aiomysql developer had used the logging module instead of warnings to tell me about the error? Would there be a memory leak? No.

And again, my main point is: your application can depend on many libraries that depend on other libraries, and you have no idea which one uses the warnings module. Some day you will see a memory leak and try to track it down. You will then figure out that the problem is in warnings.warn (used in some dependency of a dependency), and all you can do is turn warnings off or use the always filter.

I understand why this warnings functionality consumes so much memory; I don't understand why it is the default in a standard module. I would prefer to use the logging module, which has no such problem, but I can't force every library developer to use it instead of warnings. For now I have to use the always and ignore filters to prevent this kind of memory leak in the future, and that seems like a workaround.

P.S.: I fixed my MySQL query, but I would still like to have no memory leak because of this.
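One way the workaround described in this comment could be set up at application startup is sketched below. It combines the "always" filter with the standard `logging.captureWarnings()` hook, so that warnings raised by any dependency go through the logging module. The function name `configure_warnings_for_service` is hypothetical, and this is one possible setup, not a recommendation from the CPython developers.

```python
import logging
import warnings


def configure_warnings_for_service():
    """Startup hook for a long-running process (hypothetical helper)."""
    # "always" shows every warning and stores nothing for de-duplication.
    # Being inserted first, it also overrides the default "ignore" filters.
    warnings.simplefilter("always")
    # Send warnings to the "py.warnings" logger instead of writing to stderr.
    logging.captureWarnings(True)


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    configure_warnings_for_service()
    warnings.warn("emitted through logging", RuntimeWarning)
```

The trade-off is log volume: with "always", a warning raised in a hot loop is logged on every iteration, so the growth moves from memory to the log output.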

vstinner commented 2 years ago

For deprecation warnings, there are two options:

In both cases, the memory usage should become stable.

For example, the memory usage is stable using python3 -Wignore::DeprecationWarning leak.py on your example.
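The `-W` option has an in-process counterpart, `warnings.filterwarnings()`. The sketch below applies the same "ignore" rule from code and checks that nothing is shown or remembered; the helper name `run_with_category_ignored` is hypothetical, and the registry inspection relies on the `__warningregistry__` implementation detail.

```python
import warnings


def run_with_category_ignored(n_messages, category=DeprecationWarning):
    """Emit unique warnings of `category` while it is ignored; return (shown, remembered)."""
    registry = globals().setdefault("__warningregistry__", {})
    with warnings.catch_warnings(record=True) as shown:
        warnings.simplefilter("default")
        # In-code counterpart of: python3 -Wignore::DeprecationWarning
        warnings.filterwarnings("ignore", category=category)
        for i in range(n_messages):
            warnings.warn(f"deprecated call {i}", category)
        # Count registry entries, skipping the "version" bookkeeping key.
        remembered = sum(1 for key in registry if key != "version")
        return len(shown), remembered


if __name__ == "__main__":
    print(run_with_category_ignored(1000))
```

Passing a broader category such as `Warning` extends the rule to every warning type, at the cost of silencing all of them.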

teplyakoff commented 2 years ago

@vstinner Please reread my posts. The DeprecationWarning in the first post is just a very basic example of how to reproduce the memory leak. I posted a link to the MySQL library, which uses Warning (not DeprecationWarning), and any library can use any kind of warning. The problem is with the behavior of the warnings module as a whole.