torproject / nyx

Command-line monitor for Tor.
https://nyx.torproject.org/
GNU General Public License v3.0

Needs more than two gigabyte RAM if running for two days #52

Open arch-user-france1 opened 2 years ago

arch-user-france1 commented 2 years ago

Nyx seems to have a memory leak. If it runs for a long time, it consumes my whole swap. This time it ran for only 2 days and is using 3 GB of swap and 1 GB of RAM, and it won't close until the swap is freed or you press Ctrl+C multiple times.

arch-user-france1 commented 2 years ago

It seems to freeze computers that have no free RAM left. Nyx caused a total freeze the day before yesterday.

InsaneSplash commented 2 years ago

After running it for an hour it chomped through a large portion of my free memory.

terminaldweller commented 1 year ago

it appears to be a stem issue rather than a nyx issue.

atagar commented 1 year ago

it appears to be a stem issue rather than a nyx issue.

What made you conclude that? Thus far there doesn't appear to be much actionable information on this ticket.

terminaldweller commented 1 year ago

I ran a memory profiler on nyx to try to figure out who's hugging all the memory. I ran memray run --live-remote --native run.py and this is what I have so far: This is run.py:

#!/usr/bin/env python3

import nyx.starter
import sys

def main():
    try:
        nyx.starter.main()
    except ImportError as exc:
        print("Unable to start nyx: %s" % exc)
        sys.exit(1)

if __name__ == "__main__":
    main()

[Image: mem_profile]

$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
$ python3 --version
Python 3.10.6

stem is whatever version apt pulls in for nyx on Ubuntu 22.04 (it's stem 1.8.0). The nyx code I used was the latest commit on the master branch (commit dcaddf2ab7f9d2ef8649f98bb6870995ebe0b893).

arch-user-france1 commented 1 year ago

So someone would have to open an issue there instead. Will anyone do it? I certainly could, too... but I don't really want to. Keep in mind that it might still be a problem in nyx, though. Maybe nyx re-creates some class from stem and does not delete it afterwards. I am not sure, because I do not know exactly what this profiler displays.

atagar commented 1 year ago

I ran a memory profiler on nyx to try to figure out who's hugging all the memory.

That's a neat visualization. Thanks Farzad!

Unfortunately it doesn't narrow down the haystack much. It indicates that Nyx makes copious get_info and get_conf calls which is expected.

Memory issues are particularly thorny to track down. My first thought is 'why now?', by which I mean that Nyx and Stem were released in 2019. If I still worked at Tor I'd start by checking with the wider community to see how prevalent this issue is.

So someone would have to open an issue there instead. Will anyone do it?

Honestly it doesn't matter. I'm the author of both Stem and Nyx. I left Tor a couple years ago whereupon Georg (gk at torproject dot org) took over but he doesn't monitor GitHub. You'll need to email him if you'd like to get his attention (see this issue for the latest discussion on that). You can also try tor-dev@.

TL;DR: Nyx and Stem are presently unmaintained. I'm sorry I don't have better news for you.

terminaldweller commented 1 year ago

I could make a bunch of containers with different versions of everything and see when we actually run into this issue. As for the memray result, I agree. It's not much to go on. It's basically just pointing out that those decorators are taking up a lot of memory, but the how or why is unclear to me (I'm not very good with Python). I could also get a line-by-line memory read-out, but I'll try the different versions first. As for contacting gk, I'd rather have a PR ready first.
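A line-by-line read-out of the kind mentioned above could also be obtained with the standard library's tracemalloc module, by comparing two snapshots and listing the source lines whose allocations grew. This is only a minimal sketch: `leaky_step` and `retained` are hypothetical stand-ins for whatever nyx or stem code keeps objects alive, not names from either project.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the source lines whose allocations grew the most between snapshots."""
    stats = after.compare_to(before, "lineno")
    return [stat for stat in stats if stat.size_diff > 0][:limit]


def leaky_step(store, count=1000):
    # Hypothetical stand-in for a periodic update that never releases its results.
    store.extend(bytearray(1024) for _ in range(count))


tracemalloc.start()
retained = []
baseline = tracemalloc.take_snapshot()
leaky_step(retained)
current = tracemalloc.take_snapshot()
tracemalloc.stop()

growth = top_growth(baseline, current)
for stat in growth:
    # Each entry shows file, line number, and how many bytes were added.
    print(stat)
```

In a real session the two snapshots would be taken some minutes apart while nyx is running, so that one-off start-up allocations drop out of the comparison.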

arch-user-france1 commented 1 year ago

I ran memray --live-remote --native run.py and this is what I have so far

I'd like to reproduce your result, however you seem to use a different version of memray.

memray: error: argument command: invalid choice: 'run.py' (choose from 'run', 'flamegraph', 'table', 'live', 'tree', 'parse', 'summary', 'stats', 'transform', 'attach')

Do you by chance know what went wrong?

it doesn't make any sense:

(base) ➜  nyxmemleak memray --native                         
usage: memray [-h] [-v]
              {run,flamegraph,table,live,tree,parse,summary,stats,transform,attach}
              ...
memray: error: the following arguments are required: command
(base) ➜  nyxmemleak memray --native run            
usage: memray run [-m module | -c cmd | file] [args]
memray run: error: the following arguments are required: file, module
(base) ➜  nyxmemleak memray --native run run.py 
usage: memray [-h] [-v]
              {run,flamegraph,table,live,tree,parse,summary,stats,transform,attach}
              ...
memray: error: unrecognized arguments: --native
(base) ➜  nyxmemleak memray --native run run.py

terminaldweller commented 1 year ago

My bad. It should be memray run --live-remote --native ./run.py, which should return a port number. Then, from another terminal, run memray live port-number. I forgot to include the run part.

arch-user-france1 commented 1 year ago

My idea is that nyx makes a lot of calls to wrapped, which seems to be a function for calling something with default arguments (if you ask me, a very special way of implementing defaults).
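For readers unfamiliar with the pattern, a decorator that supplies a fallback value usually looks something like the sketch below. This is a generic illustration, not stem's actual code: `with_default`, `UNDEFINED`, and `get_info` are hypothetical names chosen for the example. It shows why a profiler would report an inner function called `wrapped` around every decorated call.

```python
import functools

UNDEFINED = object()  # sentinel meaning "the caller passed no default"


def with_default(func):
    """Return the caller's 'default' instead of raising, when one is given."""

    @functools.wraps(func)
    def wrapped(*args, default=UNDEFINED, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            if default is UNDEFINED:
                raise  # no fallback requested, so propagate the error
            return default

    return wrapped


@with_default
def get_info(key):
    # Hypothetical stand-in for a controller query that can fail.
    known = {"version": "0.4.7.13"}
    return known[key]


print(get_info("version"))
print(get_info("missing", default=None))
```

Because every decorated call passes through `wrapped`, allocations made anywhere below it are attributed to that frame, so seeing it high in a profile does not by itself show that the decorator is what retains memory.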

Unfortunately it is very hard to debug this code, especially because of its size. Regarding the unidentifiable thing that's hogging most of the memory:

_PyFunction_Vectorcall is a C-level function in the CPython interpreter, added in Python 3.8, that implements the "vectorcall" calling convention for functions written in Python. It is designed to make calls faster by passing arguments as a plain C array instead of building a temporary tuple and dictionary for each call.

and

_PyFunction_Vectorcall is used internally by the interpreter whenever it calls a function defined in Python. The vectorcall protocol (PEP 590) was introduced provisionally in Python 3.8 and became public API in Python 3.9.

So I assume that some function calls remain in the memory.

It is interesting that I can not observe that memory leak on my Fedora system. For how long have you been running it? I've run it for 30 minutes. Maybe that wasn't enough and now my ssh connection broke.

Has anyone got a better idea? Is anyone still interested?

terminaldweller commented 1 year ago

The memory leak is not a matter of how long it runs. It is a constant leak: you can see usage increase periodically by a very small amount, and it gets big over time. Which Fedora version are you using? And how did you install nyx, with pip or dnf? Also, could you please paste the versions of Python, stem and nyx that you say are not leaking memory?
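A slow, steady leak like this is easier to compare across machines as a growth rate than as a single reading. The sketch below is one way to do that on Linux, assuming the nyx process ID is known: it reads VmRSS from /proc/<pid>/status and fits a least-squares slope through timed samples. The function names and the sample values are illustrative assumptions, not measurements from this thread.

```python
import re


def read_status(pid):
    """Read /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return handle.read()


def parse_vm_rss_kib(status_text):
    """Extract the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def growth_per_hour(samples):
    """Least-squares slope of (seconds, KiB) samples, scaled to KiB per hour."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    cov = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    return cov / var_t * 3600


# Made-up samples taken ten minutes apart, for illustration only.
example = [(0, 50000), (600, 50400), (1200, 50800)]
print(f"{growth_per_hour(example):.0f} KiB per hour")
```

In practice the samples would come from calling `parse_vm_rss_kib(read_status(pid))` at a fixed interval for several hours; a slope that stays positive over a long window is a stronger sign of a leak than any single snapshot.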

terminaldweller commented 1 year ago

here's a flamegraph that memray has made for a "leaky" session if anybody wants to take a look: https://cargo.terminaldweller.com/memray-flamegraph-run.py.1817471.html

arch-user-france1 commented 1 year ago

here's a flamegraph that memray has made for a "leaky" session if anybody wants to take a look: https://cargo.terminaldweller.com/memray-flamegraph-run.py.1817471.html

What is this supposed to be - there are surely some files missing?

[Image: screenshot]

No interactivity at all.

arch-user-france1 commented 1 year ago

Also, could you please paste the versions of Python, stem and nyx that you say are not leaking memory?

I am not sure whether there is a leak. What I know is that it ran fine for 30 minutes, which is not an extremely long time. I've just started memray in two screens and will report the result later. I'm using:

Python 3.9.13
nyx version 2.1.0 (released January 12, 2019)

Mar 03 17:39:59.090 [warn] Tor was compiled with zstd 1.5.2, but is running with zstd 1.5.4. For safety, we'll avoid using advanced zstd functionality.
Tor version 0.4.7.13.
Tor is running on Linux with Libevent 2.1.12-stable, OpenSSL 3.0.8, Zlib 1.2.12, Liblzma 5.4.1, Libzstd 1.5.4 and Glibc 2.36 as libc.
Tor compiled with GCC version 12.2.1

terminaldweller commented 1 year ago

It's probably because of the nginx settings that I have for that file server. Just download the HTML file and open it locally. It's around 26 MB. That should fix it.

arch-user-france1 commented 1 year ago

Indeed, it's your CSP. Unfortunately I can't download the file: the download runs indefinitely with no end in sight at ~100 KB/s, and about 70% of the time it is stuck with no traffic at all.

It has reached 2% so far, but it is mostly stuck, and the HTML file is certainly not as large as stated.

Anyway, nyx was running for a long time now and there have not been any issues:

[Image: screenshot]

Could you please try with the following program versions or confirm you have the same installed?

Python 3.9.13
nyx version 2.1.0 (released January 12, 2019)
Tor version 0.4.7.13.

Output of tor --version should be:

Mar 03 17:39:59.090 [warn] Tor was compiled with zstd 1.5.2, but is running with zstd 1.5.4. For safety, we'll avoid using advanced zstd functionality.
Tor version 0.4.7.13.
Tor is running on Linux with Libevent 2.1.12-stable, OpenSSL 3.0.8, Zlib 1.2.12, Liblzma 5.4.1, Libzstd 1.5.4 and Glibc 2.36 as libc.
Tor compiled with GCC version 12.2.1

I assume it's a bug in the Python interpreter, because the issue no longer occurs for me. I really think Tor does not maintain these packages well enough. It's annoying to see that the zstd version is older and that there has not been any update for a while now.

terminaldweller commented 1 year ago

The Python and nyx versions are the same. stem is v1.8.0. As for tor --version:

$ tor --version
Tor version 0.4.7.13.
Tor is running on Linux with Libevent 2.1.12-stable, OpenSSL 3.0.2, Zlib 1.2.11, Liblzma 5.2.5, Libzstd 1.4.8 and Glibc 2.35 as libc.
Tor compiled with GCC version 11.3.0

I am running nyx in a container:

FROM fedora:34

RUN dnf update -y && dnf install -y nyx

But I am still using a lot of memory. 155 MB in 6 hours. The only thing different appears to be the versions of the libs tor was built with.

arch-user-france1 commented 1 year ago

You mentioned 155MB in 6 hours. Maybe I did not run it long enough... My stem version is 1.8.1:

(base) ➜  ~ python                   
Python 3.9.13 (main, Aug 25 2022, 23:26:10) 
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import stem
>>> stem.__version__
'1.8.1'

I'll run it for 24 hours then, to check whether I really do not experience any issues. You could, like me, try the official Tor repositories and Fedora 37, and check whether the issue appears with that setup.

arch-user-france1 commented 1 year ago

Cannot reproduce the memory leak anymore.

adrelanos commented 11 months ago

quote @atagar

I'm the author of both Stem and Nyx. I left Tor a couple years ago whereupon Georg (gk at torproject dot org) took over but he doesn't monitor GitHub. You'll need to email him if you'd like to get his attention (see this issue for the latest discussion on that). You can also try tor-dev@.

Try this if you want to see this fixed.

h3xagonal commented 5 months ago

Conducted another test of nyx and recorded some results that could be of use if anyone wants to take up further investigation.

Nyx was started on a Debian 12 system with Tor 0.4.8.11, nyx commit dcaddf2, stem commit 2074fdee installed as a virtualenv. The idle memory usage was recorded.

This system was configured as a gateway with transparent-proxying and SOCKS ports. Traffic comprising many discrete circuits was generated by a client. After some time the client system was shut down and NEWNYM was sent to the Tor control port. The memory usage did not decrease substantially.

Memory usage of nyx before stress-test. (ps aux)

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
user        5729  5.9  4.8 644476 47636 pts/0    Sl+  13:48   0:00 python ./run_nyx

Memory usage of nyx after stress-test.

user        5729  1.8 41.4 996824 407392 pts/0   Sl+  13:48   0:19 python ./run_nyx

Conclusion: The memory leak still occurs under stress when testing the present versions of the utilities in question.
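To put a number on the two ps aux readings above, the rows can be parsed and compared. This is a small sketch that assumes the standard procps column order (USER, PID, %CPU, %MEM, VSZ, RSS, ...), with VSZ and RSS reported in KiB; `parse_ps_line` is a helper written for this example.

```python
def parse_ps_line(line):
    """Split one `ps aux` row into the fields relevant to memory usage."""
    fields = line.split(None, 10)  # COMMAND is the 11th column and may contain spaces
    return {
        "pid": int(fields[1]),
        "mem_percent": float(fields[3]),
        "vsz_kib": int(fields[4]),
        "rss_kib": int(fields[5]),
        "command": fields[10],
    }


before = parse_ps_line(
    "user        5729  5.9  4.8 644476 47636 pts/0    Sl+  13:48   0:00 python ./run_nyx"
)
after = parse_ps_line(
    "user        5729  1.8 41.4 996824 407392 pts/0   Sl+  13:48   0:19 python ./run_nyx"
)

rss_growth_kib = after["rss_kib"] - before["rss_kib"]
print(f"RSS grew by {rss_growth_kib} KiB ({rss_growth_kib / 1024:.1f} MiB)")
print(f"RSS is {after['rss_kib'] / before['rss_kib']:.1f}x the idle value")
```

For the two rows recorded here, resident memory grew by 359756 KiB (about 351 MiB), roughly 8.6 times the idle figure.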