micdoodle8 / Galacticraft

An advanced Space Dimension Mod for Minecraft
Other
615 stars 333 forks source link

Block updates queued by oxygen sealing/unsealing can destroy newly moved blocks #1468

Closed perkinslr closed 9 years ago

perkinslr commented 9 years ago

ThreadFindSeal, which is invoked in TickHandlerServer and by TileEntityOxygenSealer.updateEntity, queues updates to the world for the next server tick. If, between when these updates are queued (where 'event.phase == Phase.END' in TickHandlerServer) and when they are applied ('event.phase == Phase.START', the next tick), something else moves or places a block, then the newly placed block will be immediately removed by TickHandlerServer. I've so far only encountered this with turtles/OCrobots moving, but I suspect remain-in-motion frames or similar things may cause issues as well. To duplicate it, simply create a simple program that has the robot run in circles, then trip frequent updates to the oxygen sealer stuff. Expect it to take a while, but the robot should simply vanish (I've seen it happen 3 times in over a week of having a robot run around).

As a quick-and-dirty fix, I patched ScheduledBlockChange to track what the block was when the update was queued, and have TickHandlerServer verify that the block is still what it was when the change was requested; but, due to other changes I've made, I can't easily generate a pull request and I doubt I matched your coding guidelines anyway, but if you want a patch file I can probably provide one.

radfast commented 9 years ago

Thanks for this, it's always appreciated when someone who knows about coding looks at our code.

I see the issue, I guess I hadn't anticipated that any blocks could change in that interval but I can see that other mods can also do stuff in that tick end / tick start period.

Performance is important here as the sealer is already a performance hit if it seals or unseals a very large volume all at once.

If you could post a patch file as a Gist maybe, I think that would help me.

perkinslr commented 9 years ago

Alright, I stripped the changes I made which aren't related to this issue (I made O2 use depend on living entities in the area and a couple related stability fixes), but I think the patch should apply cleanly.

ThreadFindSeal.java https://gist.github.com/perkinslr/5c812814649c32b5274a

ScheduledBlockChange.java https://gist.github.com/perkinslr/4d50a62bdbb516c0083b

TickHandlerServer.java https://gist.github.com/perkinslr/d1c7975dddcb8d4e5466

radfast commented 9 years ago

Thank you very much, I will look at all those and get them into the codebase at the next opportunity (probably not till the weekend)

I made O2 use depend on living entities in the area and a couple related stability fixes

That sounds like a nice idea! Galacticraft being an open source project, any improvements like that are very much welcomed if you care to make a Pull Request some time...

radfast commented 9 years ago

I implemented your suggestion and credited you, thank you very much. In fact the approach I took (for slight performance improvement) is not to store the current block type with ScheduledBlockChange as it will always be some type of air block, and I think it is safe to replace air with a different type of air, so in TickHandlerServer it just tests for air.

At the same time I added code to spread out the airblock changes over multiple ticks if there is a ridiculous number of them like 1,000,000 or something. That spreading now seems 'safe' to do because there is the check for an air block before replacing anything.

I see you added performance buffs to ThreadFindSeal, which is always good.

Any coder who appreciates that World.getBlock() and World.setBlock() are slow operations in Minecraft - especially setBlock() due to the lighting updates - is a good guy, in my book. If you have anything else at all to suggest for Galacticraft, go right ahead :)

I'm interested in the O2 use idea, seems like it could be good for roleplay if a sealed space stays sealed for a while after a sealer runs out of O2 or power, but if there are living entities in there (or torches?), that uses up the oxygen. I can see that oxygen use can be added fairly easily in the code which checks whether living entities can breath.

perkinslr commented 9 years ago

Yeah, I ran out of time this weekend to clean up the code for the O2 use, maybe mid week I'll have some time.

I thought about just doing an air check, but I didn't know if ScheduledBlockChange was used for anything other than air blocks, and didn't want to preclude such a use in the future. I suppose if that ever comes up, it would be trivial to include a boolean to determine if a full check is needed.

I do have it so that a sealer running out of O2 doesn't immediately unseal, nor does breaking the sealer block. I couldn't find where the check to see if living entities can breathe is (I didn't look very hard), so I just hacked around it, it's one of the things I need to fix before creating a pull request. I basically use the block metadata to track O2 levels, when they reach -1, instead of setting the block metadata, it sets it to a normal air block and starts causing damage. Since, without fans (out of power), the air on a station won't circulate much, I think it's an acceptable approximation. During the O2 sealer check, it counts how many blocks get updated and divides that by the number of active sealers, and stores that as an attribute on the sealer, it also counts lit torches as always needing a full 16 O2 units. Each sealer decrements its stored O2 on the sealer's update tick. I also made it so that the end of server tick one will handle air spaces larger than 2k, it will actually handle arbitrarily large volumes of sealed air, because as it goes through the area, every full O2 block it finds (metadata 15 breathable air), it increases the value of checkCount. In this way, if it hits something that truly isn't sealed, it will quickly find enough vacuum blocks to stop, but if it is actually sealed, it will properly verify the whole area. (I was having problems with it unsealing for short periods of time when I got an area larger than 8k blocks).

Oh, I also made it so that air locks opening give a free air block if the air next to them is safe, basically they look for an adjacent air block, if they find one, they use the same type of air to replace the air lock seals, if they don't find one, they default to non-breathable air. While this could be abused, without it someone who has airlocks that open on 10m approach will be damaged by running through their station without air tanks.

Somewhat related to that, I intend to add explosive decompression effects if I can get them working, without tanking performance. I should be able to have breathable air blocks push things towards vacuum blocks, and then spread the breathable air blocks until the sealer thread removes them. Out of curiosity, why did you make the standard air block non-breathable and add breathable air instead of adding a vacuum block?

radfast commented 9 years ago

I was having problems with it unsealing for short periods of time when I got an area larger than 8k blocks

This is a problem others have reported, I have not figured out the cause - it should not be chunk loading (because unloaded chunks are always seen as 'solid' blocks in this code), I have wondered if it is sealers which are too widely spaced so they don't always "see" each other.

Sounds like you have a great solution to all of this, and if you're interested, we could certainly include the whole thing in Galacticraft as long as it's looking OK in performance terms. No urgency, Galacticraft has been how it is for roughly 10 months now.

I'd want to look to look at "arbitrarily large volumes of air" to make sure there can't be an endless loop or out of memory issues etc.

I intend to add explosive decompression effects if I can get them working, without tanking performance. I should be able to have breathable air blocks push things towards vacuum blocks, and then spread the breathable air blocks until the sealer thread removes them.

I was thinking about exactly that, when I was working on this code last year. It would be very cool.

Out of curiosity, why did you make the standard air block non-breathable and add breathable air instead of adding a vacuum block?

By default, any new chunk will be filled with "standard air" - BlockAir, block value 0 - except for the terrain. So it's basically BlockAir from height 64 to height 256 on planets, that's 49,152 blocks of BlockAir per new chunk, more in a Space Station dimension or asteroids. Three problems with replacing all that with BlockVacuum. 1. It's quadrupling the work of the chunk generators in Galacticraft, which would have a performance hit. 2. It would increase map storage space on a server - a vertical sub-chunk (16 y values) filled with BlockAir basically takes little or no space in the map storage, anything else does. 3. The vanilla game engine does various fundamental calculations (including terrain height, lighting, and rainfall, probably also other things like mob spawning and villager movement) based on finding the highest y value in the chunk which has a block which is not BlockAir. I wouldn't fancy messing with that.

radfast commented 9 years ago

I thought of a 4th thing: if the player is above height y=256, the vanilla game engine will I think always return BlockAir if you call world.getBlock().

Because of all these things, it seemed easier to make BlockBreathableAir the exceptional case, as sealed spaces are generally small. This was a design decision taken by micdoodle8 early in Galacticraft 1. It would be difficult / impossible to change now in existing maps.

Either way, the code which tests whether EntityLivingBase can breath, whether torches can burn, and whether fire should be extinguished, needs to test the blocks every time so it makes no difference to that code which way round the BlockAir and BlockVacuum are.

Most of the testing is in OxygenUtil: https://github.com/micdoodle8/Galacticraft/blob/1.7/src/main/java/micdoodle8/mods/galacticraft/core/util/OxygenUtil.java

This is called, for example, when updating living entities: https://github.com/micdoodle8/Galacticraft/blob/1.7/src/main/java/micdoodle8/mods/galacticraft/core/event/EventHandlerGC.java#L230-256 and when updating the player: https://github.com/micdoodle8/Galacticraft/blob/1.7/src/main/java/micdoodle8/mods/galacticraft/core/entities/player/GCPlayerHandler.java#L517-636

radfast commented 9 years ago

@perkinslr If you're still interested in this, it would be great to hear from you.

For the future, I am thinking about a different approach. Stop using BlockBreathableAir in the world map. Instead, the server maintains a list (in a HashSet) of all breathable air blocks currently on the server, with also a time counter per block. The Sealer update checks "refresh" that list of breathable air blocks - the search algorithm would be exactly the same as now, instead of creating a list of air blocks to change in the world map it would update that HashSet. If the sealer switches off, the breathable blocks time counters will slowly reduce to 0. "Use" of the blocks by an entity or a torch will make the time counter tick faster. The amount of time refreshment which the sealer has to do, tells you how much oxygen supply an update tick has required.

A "leak" mechanic can rapidly use up air blocks close to the leak position, spreading out from there. (Maybe also adding a motion vector to any entities it finds - as you mentioned ...)

The leak position is tough to locate computationally, even in the simple case of a 1x1 hole in a wall. Did you have any good ideas about that?

Overall this would I think reduce server load from sealer checks, and also prevent "dramatic" consequences / flicker type behaviour if a sealer becomes unsealed for a short time (whether that's because of inadequate oxygen or chunk loading issues for its oxygen / power grid or some as-yet-unidentified issue in the code). See also #1381 and #1545

This would also solve a separate problem namely that some Bukkit plugins do not treat Breathable Air blocks as clear locations for teleporting.

perkinslr commented 9 years ago

I am still interested in this, sorry I've been busy with school (till Thursday).

The idea of using a HashSet is a good one, it would significantly reduce the server load during an update, and remove the possibility of a bunch of bugs. Also, it would let air levels in non-air, non-solid blocks (ladders, doors) be tracked properly.

The leak mechanic is pretty easy, have the HashSet track pressure and [O2] instead of just [O2]. Then, if an area is marked as unsealed via the usual mechanism, instead of clearing the HashSet entries, let them degrade as normal. This will cover things like is becoming 'unsealed' due to the removal of the sealer, or say a doubling of room size. On each tick, for each entry in the HashSet, check the coords adjacent to it, or better would be to check in a radius of 2 or so, and average the pressure between them. Any that drop below some arbitrary pressure get counted as 0, and any entities in the area get moved toward the low pressure area. It would probably need to track the positions of walls in the HashSet too, because they would need to not count toward the limit.

When an area gets sealed, it should probably bring the pressure up slowly, which would avoid a stuttering sealer causing explosive effects on players. Also, when a section runs out of O2, don't remove it from the hash set, since the pressure is still nonzero.

As a related idea, when inside an area that is sealed, sound should probably propagate normally, which would server as an audible warning if a sealer fails.

radfast commented 9 years ago

Great, I'm glad to hear from you. I'm also mostly busy until after Thursday too.

When I have time, I will change the existing code to a HashSet mechanic as I envisage. If that works out, it would be great if you would have a look at what enhancements can be added in the form of leak detection and player motion vectors.

This is not feasible I think, in performance terms - bear in mind that in a large room (e.g. 40 x 50 x 50) there can be 100,000 blocks.

On each tick, for each entry in the HashSet, check the coords adjacent to it, or better would be to check in a radius of 2 or so, and average the pressure between them.

Instead something is needed during the sealer check which identifies any vacuum block found next to a breathable block - and if the area is unsealed, that is the "leak" location. This should work, because (if things were previously sealed) the vacuum block(s) will be the block(s) which changed. Hmm, I think that could be fairly straightforward....

perkinslr commented 9 years ago

Noting down where the leaks are is probably a good idea, and we'll probably need to include in the config file an option to turn it off, since older machines are likely to struggle now matter how it's implemented. As for performance, I don't think it will be as bad as you think, the math should be expressable as sparse matrix operations, so assuming java has a decent BLAS library, it should be possible to handle fairly large areas quickly. Also, since it doesn't actually have to touch the block list anymore, it can do the entire calculation in a separate thread. Even if it ends up taking 2 or 3 server ticks before it begins actually moving entities, I don't think it will be noticeably incorrect. It would mean adding a BLAS library as a dependency, so I'll try to get it fast enough without, first.

radfast commented 9 years ago

it can do the entire calculation in a separate thread.

I like the way you're thinking.

You might have to educate me on a BLAS library...

Anyhow first things first, I will work on copying the current algorithm and in-game functionality but using the hashset approach. You can then maybe build from there.

perkinslr commented 9 years ago

Sounds good to me. By the way, a BLAS library is a Basic Linear Algebra System library, often vendor supplied and specific to a particular CPU architecture (there are some generic ones that do okay). They are highly optimised for matrix operations, so anything that you can write as a sum or product of matricies becomes fast to calculate. Because the interface is pretty well standard, you can swap one BLAS library for another without much effort (LD_PRELOAD=), but I expect java to be slightly less nice about it, probably need to do something with reflection to specify which library to use by name. Oh, and NVIDIA and ATI both provide GPU-accelerated BLAS libraries, so assuming Java can do it, it would be pretty simple to let the user specify GPU-acceleration for the calculation in the config.

radfast commented 9 years ago

hmm BLAS library sounds like a problem then, we cannot distribute platform-specific code, and we cannot assume that the player already has one installed for his platform.

Are you aware of any other mod which uses a BLAS library?

I'm all in favour of high performance / low latency code, but it may have to be achieved in the old-fashioned way, by good algorithms...

perkinslr commented 9 years ago

I rather doubt there are, the finite fluids mod might, it's the only other mod I know of that would benefit from it. As for platform specific code, there are some generic implementations that are not platform specific, their performance isn't quite as good as the vendor specific ones, but it is far better than any iterative algorithm can manage. I would ship it with one of those (looks like apache has one), which may even be pre-installed on most machines. Then, assuming the java interface is similar between apache's BLAS library and MKL and cuBLAS, at runtime or in the config, look for one of the fancier BLAS libraries and use it if it's available. That would let someone on an old computer still run it, with a bit more effort, and let someone with a new machine handle possibly hundreds of simultaneous decompressions.

radfast commented 9 years ago

possibly hundreds of simultaneous decompressions

Like a nuke went off or something? We don't have those in Galacticraft, but never say never ... (also my friend told me ICBM mod is now available in beta for 1.7.10)

I'm super happy if you want to experiment with a BLAS library once you have some free time, or any other contributions you want to make to the mod. But do bear in mind the potential difficulties of distributing or installing extra libraries with Galacticraft - first is to check licensing as we are obviously distributing to millions of players (although I guess most libraries would allow for distribution in projects, with LGPL or similar) - second is to be sure there are no installation issues even for a minority of players or a minority of platforms. If there are problems of this kind, then it would not be worth the performance benefits.

There are other areas where performance needs reviewing. There seems to be a memory issue on a client, like Galacticraft will add around 70-90MB which seems to me like too much for one mod, even such a big one as this - I am bearing in mind a modpack can have 50 mods, and anyone who is on 32-bit Java will have maximum 1GB allocated to Java, more likely 512 or 768MB in practice.

perkinslr commented 9 years ago

Unfortunately, I've not had much experience profiling java, so I don't know if I would be much use in tracking down where the extra memory is going. I've got free time now, so I'd be happy to help with things though. Once you have the sealer code pushed to an array instead of using blocks, let me know and I'll see what I can do to speed it up and handle decompression.

radfast commented 9 years ago

Hi @perkinslr

I made the array changes to sealers - actually using a HashMap - please see this: https://github.com/micdoodle8/Galacticraft/commit/dc5a335cf4bce41bc3dcfae2483af2dfb84e10d1

If you want to poke about with that a bit, please make a copy locally of the oxygen-alt branch and then make Pull Requests on that branch.

The changes I have made work. But, I have encountered the following issues:

On the plus side:

perkinslr commented 9 years ago

Alright, I'll take a poke at it. I'm fairly certain I can fix the performance penalty by creating a second list of intermediate values. Basically, when a living entity consumes some air, note the location of the O2 use in a short list, then only the items in the short list need updating during the next seal check.

The lack of instant response is a bit more difficult to handle, it would be possible to use reflection to watch calls made to setBlock, and note if the edges of any sealed spaces are modified, may be able to do something that way in GC's world provider, but otherwise it's a bit messy. The other possibility would be to set the edge blocks to special air blocks, which would still outperform the old algorithm, but would also not be ideal. I'll poke at it and see if I can figure anything out.

radfast commented 9 years ago

The seal check needs to check all blocks each time - like many other update functions in Minecraft - because blocks can change... Apart from perhaps trapping setBlock as you suggest, there is no way for the sealer to know whether the integrity of the room is still OK without checking them all.

Note: neighbouring block update triggers (the well-known BUD in Minecraft) work pretty well for BlockBreatheableAir, but are not 100% reliable because not every block change in the game will fire a BUD trigger - especially once you take into account other mods as well.

Trapping setBlock() could work - I have been thinking about that as well. I think the best approach in that case is make another array for the "crucial" blocks which form the integral edge of a sealed space, and if there is a setBlock() for one of those crucial blocks, then fire a sealer check. It's possible without too much performance hit ... setBlock() is not used that much unless an explosion is happening.

perkinslr commented 9 years ago

Aye, it doesn't remove the need to iterate over all the blocks in the area, what the second array does is remove the need to write to each item in the HashMap.

The basic sequence would be: iterate over the blocks in the area and calculate the volume sealed; iterate over the short list of where O2 was used, reset the O2 levels; decrease the sealer's stored O2 based on the total volume and the items in the short list.

This doesn't avoid the iteration, it only makes the iteration read-only, except when blocks have changed, and except for the few blocks with living entities in them.

Trapping setBlock is probably the most reliable method, and least damaging to performance, but I don't know if it would be sufficient. TileEntity blocks could switch from 'sealable' to not without triggering a block update, so we'd probably want to maintain a list of TileEntity blocks on the borders and check them periodically. Of course that would usually be a much shorter list.

radfast commented 9 years ago

There are no blocks which change sealed status according to something in the TileEntity.

There are probably a small number which change according to the metadata - the only one I can think of right now is the Astro Miner Base which has one part which is not sealed, and can be rotated with a wrench, but there may be others like that.

When an airlock opens or closes, it actually places blocks into the world.

perkinslr commented 9 years ago

I wouldn't be too sure of that, while neither vanilla nor GC add TileEntity blocks that are solid or not depending on the TE, that is not necessarily true for other mods. If they either implement IPartialSealableBlock, or if block.isSideSolid varies depending on TE properties, it could cause problem. Some mods use non ticking TEs to reduce the number of added blocks, if they don't properly manage the neighbour block change notification they could cause problems. Other than a periodic check, I can't think of a way to detect that, and I don't know if it'd be worth trying to work around that sort of bug in another mod.

radfast commented 9 years ago

Fair points. There should still be a periodical sealer check in that case, that would pick up those TileEntity / isSideSolid cases, and also cover other possibilities, I can think of:

The code I currently pushed has a periodic sealer check, it's around once every 2 seconds, so it works perfectly well apart from the fact it does not respond quickly to a hole in a wall.