liyunfan1223 / mod-playerbots

AzerothCore Playerbots Module
https://discord.gg/NQm5QShwf9
GNU Affero General Public License v3.0

Increasing memory usage until crash. #289

Open aronkleinhans opened 1 week ago

aronkleinhans commented 1 week ago

Describe the bug: After starting the server, memory usage slowly increases from about 3 GB to 8 GB over 7-8 hours. Then the server crashes with bad_alloc.

Commit hash [15d4b7a]

To Reproduce Steps to reproduce the behavior:

  1. build server
  2. start server
  3. have about 150-200 bots
  4. see memory fill up

Expected behavior: Memory usage settles after a few minutes of uptime.

Additional context: I couldn't find a solution to this. I found some related issues that lead nowhere, like the one about memory leaks, with no indication that a solution was ever shared.

noisiver commented 1 week ago

The thing is, 8GB RAM isn't going to be enough over a longer period of time. All the maps it loads into memory aren't being unloaded. I don't have crashes, and my servers run 24/7 and only restart once a week (unless I trigger a manual restart). A full preload will eat ~11-12GB RAM and if I limit it to maps accessible by the bots it takes 6-7GB. Add on top of that everything else that eats memory.

Seeing as the bots move around, maps will continue to be loaded. If they stayed in a small area, no new maps would be loaded because nothing would be there to trigger the loading.

A fresh startup for me uses about 7623MB. [image]

5.5 hours later it's using 7998MB. [image]

An increase of 375MB after 5.5 hours isn't uncommon and isn't a leak. I do preload maps 0, 1, 369, 530 and 571, which is why it starts at 7.6GB.
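To put the two readings above in perspective, here is a quick back-of-the-envelope calculation of the average growth rate. This is only a sketch: the helper name `growth_rate_mb_per_hour` is made up for illustration, and the figures are the ones quoted in this comment.

```python
def growth_rate_mb_per_hour(start_mb: float, end_mb: float, hours: float) -> float:
    """Average memory growth in MB per hour between two readings."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (end_mb - start_mb) / hours


# Readings quoted above: 7623MB at startup, 7998MB after 5.5 hours.
rate = growth_rate_mb_per_hour(7623, 7998, 5.5)
print(f"{rate:.1f} MB/hour")  # prints "68.2 MB/hour"
```

By comparison, the original report of roughly 3 GB to 8 GB over 7-8 hours works out to several hundred MB per hour, about an order of magnitude faster.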

aronkleinhans commented 1 week ago

I see, thanks for explaining!

aronkleinhans commented 1 week ago

Increased swap to 16GB. Loaded all grids. Seems stable for now, can't really say anything about performance.

aronkleinhans commented 5 days ago

Yeah, so here's how it looks after half a day. Just after start: [IMG_5261]. And after half a day: [IMG_5262]. The extra 100% CPU usage and random 1GB RAM usage is with all unused grids loaded and 200 max bots. With a huge swap and preload it is stable. But this is a sign of some leak, I think. Might be a Linux-specific issue (I suspect even ARM-specific), but base AC doesn't have it.

Without preload it crashes 9h in, even with a big swap. I can add more swap, and it's on an NVMe SSD over PCIe 3, but the extra CPU usage will bog it down.

Sure, I'll keep observing. I want this to work.

Anyone willing to help me look into this?
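For anyone who wants to collect comparable numbers, below is a minimal sketch of a logger that samples a process's resident memory on Linux by reading `VmRSS` from `/proc/<pid>/status`. The function names, the 10-minute interval, and the sample count are assumptions chosen for illustration; nothing here is part of AzerothCore or mod-playerbots.

```python
import time
from pathlib import Path


def parse_vm_rss_kb(status_text: str) -> int:
    """Extract VmRSS (resident set size, in kB) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like: "VmRSS:    123456 kB"
            return int(line.split()[1])
    raise ValueError("VmRSS not found")


def log_rss(pid: int, interval_s: float = 600.0, samples: int = 6) -> list[tuple[float, int]]:
    """Sample the RSS of `pid` every `interval_s` seconds and print each reading.

    Returns a list of (elapsed_seconds, rss_kb) tuples. Linux only.
    """
    readings = []
    start = time.monotonic()
    for i in range(samples):
        text = Path(f"/proc/{pid}/status").read_text()
        rss_kb = parse_vm_rss_kb(text)
        elapsed = time.monotonic() - start
        readings.append((elapsed, rss_kb))
        print(f"{elapsed / 3600:.2f} h  {rss_kb / 1024:.0f} MB")
        if i < samples - 1:
            time.sleep(interval_s)
    return readings
```

Calling `log_rss(pid)` with the worldserver's process ID and a larger `samples` value would give a series that can be compared between a run with preload and a run without it.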

aronkleinhans commented 5 days ago

#273 seems related

aronkleinhans commented 5 days ago

Turning off random bots altogether, so that none log in or exist, seems to alleviate the issue. So you can still use this to play solo, or with 2-3 players plus bots in a raid, but with your own accounts and characters, which you need to level up yourself normally or via commands.

Good news is this means no bg or arena! So you can use Gain honor on guard kill with elite honor gain enabled to advance in PvP stuff. No arena though... maybe with 2 clients open you can play against your own acc? (n vs n-1[unless you write a bot xD])

noisiver commented 5 days ago

This was posted in the playerbots discord server. [image] [image]

~Over 5 days with a memory usage of 6.7GB and no crashes, and with 1000 bots.~ I'm sorry, I confused the usage with another user. Either way, close to 6 days uptime without crashing.

I don't know what architecture they're using, probably not ARM though. I can't say I've run into any issues at all with the ARM CPUs I use.

My realms have only been running for an hour but I'll get back here tomorrow with information about how they're doing in terms of memory usage and stability.

aronkleinhans commented 4 days ago

Weirdly enough, I restarted yesterday and it's been stable for 10h 30m, with no increase in RAM or CPU usage. I'll let it run to see what's up.

Edit: 20h stable. Likely config issue? Or black magic.

Can you please share your playerbot.conf?

noisiver commented 3 days ago

It is actually slowly creeping up. I have no idea why or if it's normal, or if it'll stop increasing eventually. It won't crash for me though because long before it reaches the point of the system being out of memory, assuming it won't stop, my weekly maintenance will reset it.

Here it is just after starting. It's using 16.3%, 5062MB virt and 3908MB reserved. [image]

Here it is after 11 hours. It's using 19.8%, 5946MB virt and 4741MB reserved. [image]

And finally, here it is after 24 hours. It's using 21.6%, 6410MB virt and 5183MB reserved. [image]
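A rough way to read these three samples is to compare the growth rate in each interval. The sketch below uses the reserved-memory figures reported above; the helper name `interval_rates` is hypothetical.

```python
# (hours since start, reserved MB) as reported above
samples = [(0, 3908), (11, 4741), (24, 5183)]


def interval_rates(points):
    """Return MB/hour growth for each consecutive pair of (hours, MB) samples."""
    return [
        (m1 - m0) / (h1 - h0)
        for (h0, m0), (h1, m1) in zip(points, points[1:])
    ]


for rate in interval_rates(samples):
    print(f"{rate:.1f} MB/hour")
# prints:
# 75.7 MB/hour
# 34.0 MB/hour
```

The rate in the second interval is less than half that of the first, which would fit usage that is leveling off rather than a constant-rate leak, although two intervals are too few to say for sure.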