monero-project / monero

Monero: the secure, private, untraceable cryptocurrency
https://getmonero.org

monerod hogging CPU on OpenBSD after synchronisation #7027

Open tdog8622 opened 3 years ago

tdog8622 commented 3 years ago

I compiled v0.17.1.3 on OpenBSD 6.8 (current).

After starting monerod by invoking

monerod --hide-my-port --in-peers 0 --limit-rate 500 --max-concurrency 1 --max-concurrency 1 --block-sync-size 1

the monerod daemon syncs up with reasonable CPU usage. But after the message that it is synchronised with the network, it starts hogging my CPU and my machine freezes until I manage to kill the process.

moneromooo-monero commented 3 years ago

If you have the perf tool (not sure whether it's Linux only): sudo perf top -a. If not, you might have something similar.

moneromooo-monero commented 3 years ago

Failing that, getting a dozen stack traces should probabilistically show one or two in the hot code.
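The sampling idea above can be sketched as a small shell helper. This is only an illustration, not something posted in the thread: the function names sample_stacks and tally_top_frames are hypothetical, and it assumes a gdb that can attach to a running process (on OpenBSD that may mean egdb from ports, selected here through a hypothetical DEBUGGER variable). It collects a number of backtraces, then counts which function shows up most often as the innermost frame.

```shell
#!/bin/sh
# Poor man's profiler sketch: sample stack traces, then tally innermost frames.

# sample_stacks PID COUNT OUTDIR (hypothetical helper)
# Assumes a gdb-compatible debugger that can attach to a live process.
sample_stacks() {
    pid=$1; count=$2; outdir=$3
    debugger=${DEBUGGER:-gdb}
    mkdir -p "$outdir"
    i=1
    while [ "$i" -le "$count" ]; do
        # One backtrace of every thread per sample.
        "$debugger" -p "$pid" -batch -ex "thread apply all bt" \
            > "$outdir/trace$i.txt" 2>/dev/null
        sleep 1
        i=$((i + 1))
    done
}

# tally_top_frames FILE... (hypothetical helper)
# Counts how often each function appears as frame #0 in gdb-style output,
# which looks like "#0  0x... in func (args)" or "#0  func (args) at file:line".
tally_top_frames() {
    awk '/^#0 / {
        name = ($3 == "in") ? $4 : $2
        counts[name]++
    }
    END { for (n in counts) print counts[n], n }' "$@" | sort -rn
}
```

Something like sample_stacks "$(pgrep monerod)" 12 traces followed by tally_top_frames traces/*.txt would then list the most frequently seen innermost functions first.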

tdog8622 commented 3 years ago

perf is Linux only. Here are the traces I took with ktrace:

( https://man.openbsd.org/ktrace.1 ) ( https://man.openbsd.org/kdump )

kdump001.txt kdump002.txt kdump003.txt kdump004.txt kdump005.txt kdump006.txt kdump007.txt kdump008.txt kdump009.txt kdump010.txt kdump011.txt kdump012.txt

ktrace.out files (which I renamed from .out to .gz in order to be able to upload them):

ktrace001.gz ktrace002.gz ktrace003.gz ktrace004.gz ktrace005.gz ktrace006.gz ktrace007.gz ktrace008.gz ktrace009.gz ktrace010.gz ktrace011.gz ktrace012.gz

(The freezing behaviour happens every time I run monerod and should be present in each of the above traces.)

I further observed that monerod seems to cause my WiFi connection to drop after my system freezes. While the PC is not connected to the network, the system becomes responsive again, only to freeze again shortly after a network connection is reestablished.

moneromooo-monero commented 3 years ago

That looks like an strace equivalent, not quite what I was after. Your observation is interesting. Does it still happen if you set --limit and/or --out-peers/--in-peers to small values? Also, "--disable-dns-checkpoints --check-updates disabled" will test whether the problem is due to DNS queries (though it'll still check the seed domains; I don't think there's an off switch for that).

tdog8622 commented 3 years ago

It still happens if I invoke:

monerod --out-peers 5 --in-peers 5 --disable-dns-checkpoints --check-updates disabled --limit-rate 100

If I run monerod --offline my system does not freeze.

I have to admit that I do not know which tool to use to get the stack trace that you wanted.

moneromooo-monero commented 3 years ago

It's not a stack trace but a profile, to see where the CPU time is going. In any case, I have another quick and dirty patch elsewhere that's meant to address some CPU issues to do with the net layer, so maybe it will happen to help:

https://paste.debian.net/hidden/da8b1aa6/

xxd -r < FILENAME > net-patch.gz
gzip -d net-patch.gz
patch -p1 < net-patch

selsta commented 2 years ago

@BebeSparkelSparkel can you help with producing the profile moneromooo is asking for? That would help us find the issue.

BebeSparkelSparkel commented 2 years ago

There was a suggestion on reddit:

Welcome to OpenBSD. SMP is much better than it used to be, but development is not as advanced as some other OS. You may be able to help figure out what's going on with some information from btrace (https://marc.info/?l=openbsd-misc&m=164181728803834 has some tips about running it). Or you might find this workload works better running a GENERIC kernel rather than GENERIC.MP.

I have tried without multiprocessing but with the same result.

btrace may produce the profile that you are looking for. I will try to get that profile soon.
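For reference, a kernel-stack profile with btrace might be collected roughly as below. This is a hedged sketch, not a command taken from the thread: the wrapper name profile_kernel_stacks and its DRY_RUN switch are hypothetical, and it assumes dt(4) tracing has been enabled (kern.allowdt=1 in /etc/sysctl.conf, followed by a reboot) and that the command is run as root. The bt(5) program samples at a fixed rate and counts how often each kernel stack is seen.

```shell
#!/bin/sh
# Sketch: sample kernel stacks with btrace(8) while monerod is misbehaving.
# Assumption: dt(4) is enabled (kern.allowdt=1) and this runs as root.

# profile_kernel_stacks OUTFILE [HZ] (hypothetical wrapper)
# With DRY_RUN=1 it only prints the command it would run.
profile_kernel_stacks() {
    outfile=$1
    hz=${2:-100}
    # bt(5) program: on every profiling tick, count the current kernel stack.
    prog="profile:hz:$hz { @[kstack] = count(); }"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf "btrace -e '%s' > %s\n" "$prog" "$outfile"
        return 0
    fi
    # btrace prints the collected map when interrupted (Ctrl-C).
    btrace -e "$prog" > "$outfile"
}
```

Running profile_kernel_stacks /tmp/btrace.out while the system is spinning, then interrupting it after a few seconds, should leave the stack counts in the output file, assuming the machine stays responsive enough to do so.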

BebeSparkelSparkel commented 2 years ago

When I reported that the system was in a state of 100% Sys and 100% Spin, an OpenBSD developer gave me this insight:

the daemon is doing something that is tying up system resources. Spinning is when something is trying to grab a kernel lock but is unable to.

https://www.reddit.com/r/openbsd/comments/txdt4o/comment/i3orkh0/?utm_source=share&utm_medium=web2x&context=3

offshoremonero commented 1 year ago

I think it has something to do with lmdb and WRITEMAP.

If you comment out the MDB_WRITEMAP lines in src/blockchain_db/lmdb/db_lmdb.cpp the issue goes away, but you end up with a broken lmdb. :(