SpongePowered / Sponge

The SpongeAPI implementation targeting vanilla Minecraft and 3rd party platforms.
MIT License
383 stars 210 forks source link

Chunk Pre-Genning Issue? #841

Closed IBGnova closed 7 years ago

IBGnova commented 8 years ago

Hello, So I recently tried to pre-gen my world, but I hit a few bumps along the way. Firstly I was using project borders 0.1.0: https://github.com/trentech/ProjectBorders/releases/tag/0.1.0 I was trying to gen a world of 15k by 15k with the command /border generate world -i 20 -p 0.3 When I tried that it would run fine for about an hour or so, and then it would crash with this log: http://pastebin.com/gshJzn0h . So after that I tried pre-genning it with the command /border generate world -i 20 -p 0.2 . And then it was working just fine, but after a few hours of no one being on the server (It was white-listed at the time) it started building up mass amount of entities and seemed to slow to a crawl, at first it was showing in console that it was loading 10-20 chunks, but then by the time I woke up from sleeping (I just let it run over night) it was down to only loading 2 chunks every time it spit something in console.

Here is a list of all info I have: Crash Log: http://pastebin.com/gshJzn0h Project Borders Plugin: https://github.com/trentech/ProjectBorders/releases/tag/0.1.0

Mods/Plugins: Sponge Forge 1542 Forge 1902 Pixelmon 4.2.7

Any other info that is needed I would be more than happy to provide. @DDoS

theboomer commented 8 years ago

Other pregenerating plugins also seem to bang into a wall after not long of pregenerating a world for 1.10 - on mine, the author states he is simply using the built-in sponge pre-generator, but i can't get more than 15% of a 12kx12k world done before crashing out of memory @6Gig assigned to server. I should see if I can monitor entity counts as well, if two different projects (which arguably call the same core code) give similar behavior re: entity explosion

theboomer commented 8 years ago

I did a bunch of further testing; these crashes and bad rates of production, and constantly growing entity counts were done with a version of spongeforge from several days ago. I made a plugin to output entity-count in the world at 20s intervals to the log, and repeated this same pregeneration, slowly crawling shortly after starting, majorly consuming memory without really seeing any freed up, and terminated the results after just 2h.

Then I repeated the tests last night with the latest versions , using the exact same map seed - there were Chunk-GC code changes and some other chunk optimizations added then - and the results were phenominally different - it was blazingly fast to start, and maintained that same generation rate steadily for 9.5h until finishing (generating 100% of a 12k diameter map) , but the total entity counts had remained climbing the entire time still, ending with 125k entities in world. (no one logged into server at all, just console commands the entire time of testings). Taskmanager showed about 4.8gig memory usage for the process, and that held fairly constant for 30mins beyond completion, simply logging an unchanging entity count for 30mins longer.

https://forums.spongepowered.org/t/project-worlds-v0-9-1-world-manager/10170/428

Whatever changes were made in chunk-gc are clearly amazing in their effect here under identical circumstances otherwise, but entity counts dont go down, which would suggest that pregenerating doesn't unload the map chunks?

If unloading them is a responsiblity of the plugin utilizing the generator code, it may be simply that... Time to write a simple pregenerator test and see, I think I might attempt that now.

ST-DDT commented 8 years ago

What does the green and the yellow bar on those chart show? Could you show us the test code you used for those tests?

theboomer commented 8 years ago

the green is total cumulative chunk-generated count, the orange/yellow/brown one is world total entity count - a bit of logfile manipulation and spreadsheet magic gives me a summation of the cumulative chunks at the timepoints where the entity-total is spit out (reducing 100k+log lines to a few thousand points) The plugin being used is: projectworlds-5.0.0-0.10.1.jar (TranTech ) sorry i cant find a link now to his source code, but I'm using his /world fill command with whatever default values he's pushing into his generator the only code im using is this method scheduled to start 30s after server initialization, which calls itself:

 public void entCount() {
        System.out.println("Entitycount = " + world.getEntities().size());
        Sponge.getScheduler().createTaskBuilder().delay(20, TimeUnit.SECONDS).execute(c -> this.entCount()).submit(this);
    }

I'm trying to make a all-defaults pregenerator plugin right now but cant seem to get it to log the results to the logger re: its generation and progression yet.

EDIT: Realized the secret weapon, JDGUI - to view his jar file, unfortunately, the class that has his fill command throws up an //internal error// message, the only file in his plugin that does, the one needed to peek at

theboomer commented 8 years ago

73beb0b3ec494f6eb7c5a37f24daf92f 1

Wrote a tiny plugin to simply automatically start pregenerating a map, no command stuff to deal with, configurations, just raw do-it testing... Memory was around 1.7gigs after a few mins running, and the stretches of 'generated n chunks in x time' between entity-count outputs was incredible compared to previous attempts. While running the server memory increased ever so slightly, often hovering and dropping back a little then moving again, and after about an hour was around 3.6gigs, then dropped down to about 3.3 gigs by the time i stopped the server after about 1h45min, and was 25% finished the job Clearly, the sponge pregenerator has to be keeping chunks loaded, is it not? Entity counts go up consistantly, never down, proportionally to chunks - they have to be entities in the new chunks sticking around in memory somehow... Since the pregenerator is an automatic task, there isn't really the ability to add extra code that runs between its intervals (though i guess a coordinated scheduler synched to run something in-between its runs maybe - and no way to get a list of the chunks that were just generated in that cycle in order to unload them separately...

I'd love a confirmation method to output the count of chunks loaded, to see if it really is remaining loaded, or if they;re being unloaded yet the entities stay somehow?

Per discussion in irc, I might try to dump entity coordinates on-spawning for a few minutes of just pregenerating a small diameter clean map, and then dump all the entity coords after pregen is done

Pastebin of the code used: http://pastebin.com/8tM40aPT

DDoS commented 8 years ago

The pre-genereator loads chunks in groups of adjacent 4 to ensure population takes place, but afterwards leaves them to be GC'd by the internal algorithm. The entities staying could be an issue with SpongeForge 1.10. Could someone try it on the last 1.9 build?

theboomer commented 8 years ago

Was going to post results of more testing here, but then noticed your 1.9 request.
I have made a 2 min video that shows the results of the testing, with visualizations of the server logs in case there is valuable information in the patterns seen for solving the problems.

https://www.youtube.com/watch?v=Sm16GCweoUo

End result= 1.9.4 final version has exact same behavior as 1.10.2.

When the server starts and generates the initial chunks, thousands of spawned entity locations are produced, yet when the dust settles, only 1/10th of them are actual entities remaining in the world - so there must be a lot of entity destruction going on -- most of the spawn events are bursts of multiple spawns at 'nearly' the same location, same x/z with a higher or lower Y, same y/z with a different x, etc. The multiple spawn events 'collapse' into very few actual entities in that process -- but in the pregenerating system, every entity spawn does NOT 'collapse', so not only are the entities still there in memory, but every-possible-entity in the chunks, including the 9 that dont actually exist for each 1 that does.


For completeness sake, I will run a 1.8.9 control test, and see if there is any information usable in any hidden pattern shifts, if the 1.8.9 is actually working correctly (the assumption it is), and as well, I'll do the spongevanilla as well JUST IN CASE its a forge quirk throwing something strange at it.

theboomer commented 8 years ago

spongeforge-1.8.9-1890-4.2.0-BETA-1620 behaves EXACTLY the same way as the others - tens of thousands of entities generated and remaining the same way.

spongevanilla-1.8.9-4.2.0-BETA-348 as well, behaves EXACTLY the same way, with the notable exception that its memory usage and usage increase is trivial to make a 4k diameter map compared to spongeforge versions.

I'm not testing the others, as there is no reason to expect them to be different, without any feedback to guide a reason to test or more parameters to observe, or specific old versions to test as comparison if there is a hypothesis that sponge optimization changes might have resulted in this problem case.

IBGnova commented 8 years ago

As stated above I don't think this is just a 1.10 issue. I started this issue report due to me trying to pre-gen my world on 1.8.9 (Sponge Forge), and my server crashing constantly unless I turned it down insanely low.

DDoS commented 8 years ago

A recent optimization that was added is chunk neighbour pointers to speed-up some operations (faster than chunk lookups). Maybe the chunk neighbours aren't being freed correctly? Just looking at the freeing code, this line should be outside the if condition, as it should be executed unconditionally. Also there doesn't seem to be any mechanism to prevent a new neighbour pointer from being created after the chunk unload callback has been invoked.

@amaranth you implemented this, right?

IBGnova commented 8 years ago

It's been almost a month now how long do fixes normally take? I don't meant o be rude just genuinely curious.

bloodmc commented 8 years ago

@IBGnova did you test with watchdog disabled? set it to -1 in server.properties

IBGnova commented 8 years ago

I don't see that as an option.

theboomer commented 8 years ago

IBGnova - good news, after a very difficult and confusing back and forth testing some optimization versions the past two days with blood and me driving him crazy while he was making me feel like I was crazy, the crazy sunvagun blood identified that the cause was conditional entities not cleaning up in chunks generated with no one on the server (as many of us do pregenning a huge world).

As a good note, indirectly this problem was dramatically reduced via other chunk optimizations: Recent patches to 1.10.2 in the past weeks (I dont know which version became the flipping point) resulted in collapsing of the insane-amounts of entities down to the 'real' entities that would be in the chunks as it is already, as recent tests still had this behavior but producing tens of thousands of entities in timeframes where a hundred thousand were before, so recent versions will have stood a decent chance of getting to 10k diameter or so before struggling. But he has committed some code that take care of cleaning up the rest of the entities during pregenning now.

IBGnova commented 8 years ago

This is amazing, thank you guys for all of your hard work, it really means a lot. This fix will without a doubt make a lot of bigger servers (or all servers :P) quite happy. I do not know much about coding (at all), but will this effect take place in plugins utilizing this system automatically, or will they have to update?

Also, there seems to be another issue with world generation, but I am not sure if this is something linked to you guys directly or not. For some reason I have noticed massive drops in TPS sometimes going down to 6-8 at times when we had very few players online (20-25). At the time I just brushed it off until it became a real concern and issue, so I started to TP to people around my world and I stumbled across a player that was in a jungle. The player was dead center in the middle of a jungle and right next to him was a lava source, and when you got next to this lava source the server would drop in TPS as stated above. After insane amounts of testing with my dev we found out the lava sources only lag if they are world-genned (Not sure if it is linked to pre-genning in any way) in a specific way, the requirements for the lava causing lag are listed below:

It has to be located IN a jungle.

The source or the block flowing out of a wall directly next to the source has to see sunlight (Does not work through leaves).

The source HAS to be created with world-gen or pre-gen (This does not work if you place a lava source down).

This issue seems to be linked to an issue with a lighting glitch, although we are not 100% if that is the cause. I was not sure if the cause might stem from something sponge is doing with world generation or not, I have already sent blood this info although I am not positive it is GP causing it, so I thought I might as well post it here while I have your guys' attention. :P

If you would like me to make another post with all of this information I will do so.

Thanks, Nova

theboomer commented 8 years ago

I can attest to the fact that what blood has been working on in the past many days has been a buttload (though I dont know if thats an imperial or metric buttload..) of optimization methods regarding chunk loading and lighting and updating -- In some of his test versions, I've tped into the dead center of an ungenerated jungle, and been able to fly through it quickly while generating chunks the entire time, and not only was tps unphased, the actual time spent on activity in that tick was dramatically reduced. Many of the changes he is doing are all intertwined together to greatly optimize stuff and make as much stuff asych as possible, and in the end all these parts acting together will dratically impact a lot of lag issues and lighting-based issues hopefully.

I think you'd need - as always - to try newer and more recent builds of the server files to see if any fixes have directly or indirectly addressed the problem you see; if you 'can't' upgrade to a given version yet because of various mods or plugins that wont let you upgrade to it, you at least need to identify if the issue is mod related or addressed in the recent builds, as if they resolve the problem, there is nothing for the sponge devs to do. Otherwise, yes, a separate issue specific for that - and worth checking if its a reproducible (any lava sources in any jungles on your map) or "just that map is screwed up" since you know what you're looking for.

Regarding the worldgen stuff: All the plugins I've seen for pregenerating chunks in sponge use the exact same built-in generator, and most of them use the same built-in generator default settings unless you override them with specific settings (and you'll probabably read warnings from the plugin authors to just let it work with the default values) The pregen is a closed blackbox system that just needs any non-default settings put into it, the world and border sizes to be specified, and it takes care of everything internally so every plugin is just a 'command interface' to launch this code, so upgrading the servercore jar doesnt require modifying the plugin to get the new generator process working.

IBGnova commented 8 years ago

"Otherwise, yes, a separate issue specific for that - and worth checking if its a reproducible (any lava sources in any jungles on your map) or "just that map is screwed up" since you know what you're looking for."

I have spoken to another server owner that seems to be having the same issue with lava in jungles, so I highly doubt it is just the map I am using, or something within the map that is causing it.

ST-DDT commented 8 years ago

Mind sharing the seed and the locations (and all other nessesary setup data) so this can be reproduced?

IBGnova commented 8 years ago

I am trying to reproduce it again now, I will edit this with the info once I get it.

IBGnova commented 8 years ago

After looking for about an hour and a half I have yet to find any lava on the surface of any jungle. If one of you guys would like to have a crack at it here is my seed: 1983355873225757945 and here is the website I was using to find the jungles: http://mineatlas.com/?levelName=1983355873225757945&seed=1983355873225757945&mapCentreX=-58392&mapCentreY=54476&mapZoom=17&pos=&Player=true&Spawn=true&Likely+Villages=false&Ocean+Monuments=false&Jungle+Temples=true&Desert+Temples=false&Witch+Huts=false&Slime+Chunks=false

If you find some lava on the surface make sure to not update any blocks/light around it (that is when it stops causing lag).

theboomer commented 8 years ago

-- SpongeForge, SpongeVanilla -- Build versions -- Plugins and mods installed, and any similarity to your other owner-friend ("We both have pixelmon" ) etc

Can you go back in time in your mind to when you were trying to investigate this jungle ("it was either the thursday or friday...") and recall the player you were addressing? You can then open the logs from those days and with a good editor and search command, likely find your lava investigation, and backtrack a bit to find your tp coordinates....


Was it a single, solitary lone source lava block sitting on the surface like aliens planted it there with no flow from it? Was it a lava pond with multiple sourceblocks? Was it a solitary source block with flow around it?

Do any of your pixelmon mods or plugin things do something like destroy a mob with a block of lava or such for a visual effect, and it is lava that has been placed there that is not part of the natural terrain?

IBGnova commented 8 years ago

-- SpongeForge At the time it was 4.2.0-BETA-1666 -- Forge latest 1.8.9 version. -- Plugins and mods installed: The only mods/plugins that I could think that would cause this would be: Pixelmon Sponge Grief Prevention Project Worlds Project Borders

All of the above I have in common with the other server owner.

I tried going back into all of my logs from the day of, but I can't seem to locate any info on the incident.

The only thing about the lava is, the ones we saw were coming out of walls in the side of hills. They were one block sources, and they took a while to start flowing (5-10 seconds), if you place a block near them or under the source block it will stop lagging.

IBGnova commented 8 years ago

"Do any of your pixelmon mods or plugin things do something like destroy a mob with a block of lava or such for a visual effect, and it is lava that has been placed there that is not part of the natural terrain?"

I don't quite understand the first question, sorry. The lava is naturally spawned in with the terrain.

IBGnova commented 8 years ago

Any more news on this issue?

marvin-bitterlich commented 7 years ago

Can confirm this is still an issue with spongeforge-1.10.2-2092-5.0.0-BETA-1750.jar

Aaron1011 commented 7 years ago

Is this still an issue with @dualspiral's pre-gen refactor merged?

Yorkforce commented 7 years ago

It does still seem to be an issue on Edit: spongeforge-1.10.2-2221-5.2.0-BETA-2202.jar

Generated 2 chunks in 0s 089ms, 41% complete Overworld, Generator: BIOMESOP, Entities: 2834, Loaded chunks: 8527

After running and re-running pregen it appears that newly generated chunks stay loaded, which in turn keeps entities loaded until the server is restarted.

bloodmc commented 7 years ago

This is no longer an issue. @Yorkforce you wont receive any support running outdated builds.

Yorkforce commented 7 years ago

@bloodmc Sorry, was meant to write Spongeforge 1.10.2-2221-5.2.0-BETA-2202 (was very tired)

We are unable to use pregen at all right now because of the above issue. Once it hits about 40% the TPS plummets to around 1 and it takes around 18 seconds to generate 2 chunks irrespective of the above timing report, I don't think it would reach 50% before the server dies.

For more information we are trying to pregen a BiomesOP world with a 10k border on Direwolf20 modpack with no players online (new world), I expect everyone would run into the same problem, which makes things difficult as sponge support forums advise to pregen worlds to ensure the server doesn't lag, which we had noticed it does with just 2 admins online.

dualspiral commented 7 years ago

@Yorkforce This is what I have tried to fix in my refactor that @Aaron1011 asked about, or at least provide the hooks to do so, in API 6. Because of the fact that API 5.x is non-breaking, this hasn't been backported. If the Sponge team would like this, I may see if I can backport anything (it'd probably only be the events and the ability to skip chunks), but it'd be limited, if I am able to.

We've discussed the fact that current version of Nucleus Mixins don't work on your setup due to the number of mods - I'm keen to try to fix it there too, but that's nothing to do with Sponge and I'll deal with that elsewhere.

Yorkforce commented 7 years ago

@dualspiral Ah I see that makes more sense, thanks. I was searching github issues for the same/similar problem and noticed this issue which sounded pretty much like what we were experiencing. I'll keep my ears to the ground on Nucleus. Edit: and I just noticed this is the SpongeCommon github, time for that coffee I think ;)