Zylann / godot_voxel

Voxel module for Godot Engine
MIT License
2.57k stars 240 forks source link

Multipass generation #545

Open Zylann opened 11 months ago

Zylann commented 11 months ago

This issue is about introducing a generator that can access neighbor chunks, in order to facilitate generating trees and small structures that cross chunk borders in blocky terrains. In practice: accessign neighbor chunks while generating a given chunk (and oh boy this is a rabbit hole).

I am still experimenting with this so nothing is set in stone, and I don't have an ETA. As more people bump on the problem of trees, I kept running in circles myself trying to find better aproaches for a while now, so I'm creating this issue to list in depth what I have so far.

Some notes about terminology in this post

The problem

Many probably bumped into generating trees and other small structures across chunk borders in blocky terrains. Indeed, currently it is not possible to access neighbor chunks from inside VoxelGeneratorScript.generate_block. The reason is mainly multi-threading, and simplicity. A terrain just requests generating every cubic chunk (aka "block") around players, then every chunk is sorted, distributed across several threads, generated and returned.

I wrote a section in the documentation to explain the limitation in more detail: https://voxel-tools.readthedocs.io/en/latest/procedural_generation/#handling-block-boundaries-with-voxel-structures This contains a workaround approach based on determinism, chunk coordinate hashes, and shortcuts taken by assuming the base terain uses 2D noise. It was implemented in this demo. While effective, it seems very few people I explained this to managed to understand that approach. It's not intuitive, requires some math/noise knowledge, and gets limited the more complex base terrain generation is.

The same article describes those pitfalls, and then tries to define another approach where neighbor chunks would actually be accessible. Ironically, this is actually even more complicated, potentially slower even. But if it was handled by the engine somehow, it could lead to a more intuitive API that "just works" as long as you don't wander too far away from the generated chunk.

Introducing multiple passes

A solution to this would be to split terrain generation into multiple passes. Each pass would be able to access results of the previous, including neighbors, which can simply be written to in case of overlap.

Chunks would be tagged with a level, corresponding which stage of generation they are at (a chunk that was processed by pass 1 and 2 will be at level 2). Chunks would be considered generated after they have gone through all the passes (but we'll see it's not as simple as that).

For example, for supporting trees, we could have the following passes:

Most of the time, 2 passes will be enough, but we can also imagine having a 3rd pass for lighting (though that can be explored later).

Chunks lifetime

The fact we can now have partially-generated chunks and extra passes running on them means they have to be stored in a map somehow.

For now it is not specified whether this map is the same as the main storage of VoxelTerrain, or a different map (see Implementation section), but it may be important to note that chunks that have not finished generating should NOT take part into gameplay. For example, game logic that runs next to the border of the loaded map must not be allowed to modify voxels of chunks that are not fully generated. In Minecraft, game logic actually stops in phases the closer we get to map borders.

Conversely, chunks that have finished generating must no longer be touched by neighbors finishing their generation process themselves.

Overall it means chunks begin their lifetime as a partially generated chunk not taking part in gameplay, then they transition into gameplay and stay there until being edited, eventually saved or reset.

API

The kind of API scripters would implement could look very similar (pseudocode):

extends VoxelGeneratorMultipass

# The number of passes and how many neighbors they can access
# can be defined with some properties.

# Will be called multiple times per chunk, with incrementing pass index
func _generate_pass(voxels: VoxelToolMultipass, origin_in_voxels: Vector3i, block_size: Vector3i, pass_index: int):
    if pass_index == 0:
        # Base ground
        for relative_z in block_size.z:
            for relative_x in block_size.x:
                for relative_y in block_size.y:
                    # Do something to every voxel...

    elif pass_index == 1:
        # Trees
        # Find ground to plant trees...

VoxelToolMultipass isn't a real class yet, but the idea is to have a kind of object similar to VoxelTool that is allowed to access voxels in the generated chunks AND neighbors, up to a limited distance (because otherwise it becomes very difficult to control this in a multi-threaded environment and know when a chunk actually finished generating). Those neighbors would be in a partially generated state, having at least been processed by previous passes.

It would still process chunks one by one. The fact it can modify neighbors does not mean those neighbors must be fully processed in the same call, or can be considered processed. They are only available to allow "partially" overlapping content into them, as is inevitable with structures and lighting. They will still get processed too in another call to _generate_pass, which interestingly can also have access to the previous chunk as a neighbor. In practice: dont plant trees in neighbor chunks, but trees you plant in the main chunk can overlap with neighbors.

Side-effects

As you might have guessed, being able to access neighbors comes with several pitfalls.

Inter-dependency

Let's say we have the 2 passes for Base Terrain and Trees described earlier. We could think that in order to generate 1 chunk (E below), we only need to generate this set of chunks:

Chunks:   Levels:
A B C     1 1 1     1 -> chunks processed by pass 1
D E F     1 2 1     2 -> chunks processed by pass 1 and 2
G H I     1 1 1     

But remember, pass 2 can modify its neighbors when it spawns structures overlapping borders. That means when we start generating chunk F, it can attempt to modify E as well... which means E cannot be considered generated yet, even though it has gone through pass 2.

So instead, to guarantee that E is fully generated, we have to process E AND all the neighbors that can access E. We could think about requesting neighbor chunks too, so we can consider the middle one complete, which cumulated results in this pattern:

Chunks:       Levels:
J K L M N     1 1 1 1 1
O A B C P     1 2 2 2 1
Q D E F R     1 2 2 2 1
S G H I T     1 2 2 2 1
U V W X Y     1 1 1 1 1 

However, while intuitive, this is still not enough, because this idea would only cover inter-dependency in the last pass, and not the passes it depends on. Let's consider the following 3 passes to generate chunk A:

x x x D x x x     1 1 1 1 1 1 1     Pass 1: base, no extents
x x x C x x x     1 2 2 2 2 2 1     Pass 2: trees, extents 1
x x x B x x x     1 2 3 3 3 2 1     Pass 3: light, extents 1
x x x A x x x     1 2 3 3 3 2 1
x x x x x x x     1 2 3 3 3 2 1
x x x x x x x     1 2 2 2 2 2 1
x x x x x x x     1 1 1 1 1 1 1

Here we had to process light in A and all chunks around A because that's the nature of how light must be processed. But let's consider chunk C. Light in chunk B is now processed and can have made a short round trip to C, via a hole in a wall for example (and thats why pass 2 has extents 1). However, C's neighbors have not all gone through pass 2 (trees, also with extents 1). So when D enters pass 2, it can spawn a tree that extends through C, which can block the hole light has gone through earlier. The result is that light in chunk C will be incorrectly generated, since it will remain leaking through a hole that no longer exists. Ideally, once light starts spreading into a chunk, no more changes should occur in that chunk.

This is only an example and the same scenario can occur in any kind of chunk processing that involves neighbors, with various consequences depending on what the pass is doing.

Therefore, for a pass to run in a given chunk, we must be certain that the chunk it modifies cannot be further modified by neighbor chunks of the previous pass. We only allow this within the same pass. Here is the actual pattern to guarantee this:

a a a a a a a a a    a = pass 1
a b b b b b b b a    b = pass 2 incomplete, partially affected by itself
a b B B B B B b a    B = pass 2 complete, partially affected by c
a b B c c c B b a    c = pass 3 incomplete, partially affected by itself
a b B c C c B b a    C = pass 3 complete
a b B c c c B b a
a b B B B B B b a
a b b b b b b b a
a a a a a a a a a

And now you might start to see why multipass generation can be more expensive than the initial hack with deterministic hashes.

Determinism

Another pitfall affects determinism. If your world generation is based on a seed, you usually expect to get the same world if you re-generate it.

Consider 2 neighbor chunks A and B. If pass 2 can modify neighbors, that means A can modify B. And then later, B can modify A... but the order in which this happens is unpredictable, due to multi-threading and player movement.

That means you have to be careful when writing into neighbor chunks, if you overwrite log voxels with leaves for example. To be reproductible, chunk processing order must lead to the same outcome. For trees, you could have rules in place to prioritise voxel types over others. For light, spreading order naturally doesn't matter... as long as it spreads through chunks that definitely won't change later in the generation process.

This is not necessarily a big deal, it might depend on how you generate your chunks.

Limitations

Multipass generation still has its own limits.

1) In practice, it will only work best for blocky worlds similar to Minecraft. Structures that are parts of terrain voxels is much less common with smooth terrain, where separate model instances are preferred instead.

2) It will not support LOD. In fact, generators that directly support LOD only do so because of a specific design which is to avoid storing voxel data, generating it on the fly instead based on procedural shapes (which is implemented in VoxelLodTerrain). Executing all passes at multiple LODs sounds extremely expensive. Besides, nobody so far ever implemented LOD directly in blocky world generators. If LOD ever comes to this kind of terrain, it will likely work differently than in VoxelLodTerrain.

3) It can remain expensive. It is very easy to spam passes and increase neighbor distance, for easy coding and not needing procgen math hacks, but that can put enormous pressure on the generation process, to the point the area players can actually see is just the tip of the iceberg that partially-generated chunks are. For the same reason, multipass generators might not be the best solution for large structures such as cities and dungeon networks sprawling dozens of chunks.

4) Cubic chunks. While that doesnt sound like a limitation at first, keep in mind that checking neighbors in 3D is far more expensive than doing 2D checks for full colums of chunks. Checking 1 neighbor in 2D requires 8 map fetches. 3D requires 26. And it ramps up really fast as more neighbors need to be checked (x^3 instead of x^2). Examples so far were in 2D for simplicity, but by default, the voxel engine works with cubic chunks.

Column passes

As an extension to point 4) of Limitations, some (or all?) passes could work on columns of chunks instead of single chunks.

Doing this has several consequences and advantages:

I dont know how this can be mixed with 3D passes. It's also possible that this kind of generator would entirely work column-based, and not actually do anything in 3D, because in practice that's the most common use case.

Also, despite the claims, not limiting height and keeping streaming be cubic-chunk based still makes things really hard, because lots of edge cases and inconsistencies come to complexify the task so much more than if everything was just columns with limited height everywhere streaming in 2D like Minecraft.

Large structures

As listed above, generating large structures such as cities spanning dozens of chunks seems like it would overwhelm the passes approach. One significant reason is that cities could be rare. In Minecraft, villages only spawn occasionally. But if they were generated with such a pass system, we would need a pass that can access up to 8 chunks away, which is WAY too much, because that would mean ANY chunk would ALWAYS generate 8 chunks around them just in case a village could possibly alter them.

So instead, an alternative is for the generator to keep track of a separate data structure with larger chunks. Such chunks would then store where large structures are, regardless of voxels of the terrain. Maybe the only conditions these chunks would check are biomes (seas being a biome too), which are also procedural and don't need voxel access. The generator would then figure out how to place pieces of the large structure in a local manner, rather than trying to place the entire thing down in one go?

This needs more experimentation, and sounds like this job is up to the user to implement. It can be fairly game-specific as well. I have some ideas how to handle this, but it will be for another time.

Streaming

So far we've mostly described the logic that occurs in memory... and indeed, keeping all partially-generated chunks in memory over a potentially larger area than the playable area is going to increase memory usage a lot. If this is not streamed, it will very quickly make the game run out of memory. So just like regular chunks, we also have to save them when they get far away, and load them when needed.

This change has a lot of implications, because so far generators were thought to have no state and being able to generate a chunk anytime. Having to stream cached data throws this all away.

Save format extension

First, where do we save this?

It sounds like we could save these chunks into the same stream as regular chunks. After all, they are very similar, with the only exception that they have a new property on them, their "pass index", telling what their generation level is, and by extent, whether it has completed or not. Minecraft does something similar.

Adding extra data means the save format has to be extended, and existing saves will have to remain compatible somehow. We could assume that old saves can never contain chunks that are partially generated.

What to load

Considering a 3-pass generator, we saw that to get a complete chunk, we need access to 5 neighbors outwards. That means for a view distance of 16 chunks, the game has to load 21 chunks ahead instead.

To recap, here is a schema of what the world would look like, after spawning and moving a little:

multipass_generation_and_streaming

Saving is actually required

And then I realized something: if we don't have a stream to save the data, multipass generation will break.

How so?

This is because by default, if we have no stream, the terrain simply unloads and forgets chunks that are too far from the viewer. This is not a problem for single-pass generators, they don't need to access neighbors. They just need a chunk position and they can give the result anytime.

But with multipass, imagine we generate the spawn area. That will normally work, and we will be able to move away for a little while. But then... if we retrace our steps, the game will face quite a dilemma, when having to generate again chunk X and a whole area around it has been "forgotten":

- - - - C C C
- - - - C C C
- - - X C C C
- - - - C C C
- - - - - - -

We could think that the process could start all over again, generating each passes around the chunk in a large radius. But what about C chunks? These were already generated, and could have been edited by the player. We don't expect them to be touched. But if we just allow X to generate as usual, it will attempt to modify these chunks again, because of the possibility to modify neighbors. That means potentially duplicate structures at the boundary, or lighting bugs.

This problem repeats when we have an existing world that didn't use multipass generation, and gets played later on with an updated version of the generator that uses multiple passes. It could even happen if the game gets shut down or crashes before it finishes saving the map.

I haven't experimented this enough to tell more how it would break in practice, but it certainly sounds like things can go sideways in many confusing ways. I'm not sure how this would be handled at the moment, apart from just letting things run at the risk of modifying existing chunks.


Implementation (WIP)

There can be a new generator class, VoxelGeneratorMultipass, which has a generate_pass(...) function that will run for every pass of every chunk.

Below are some notes of what I tried and what I'm planning to try.

Naive nested tasks

The first implementation of this is relatively simple, but quite naive. At the moment, I don't handle columns yet.

A map of partially generated chunks is created. Each chunk is also tagged with a level, which indicates which passes have run on them. To allow multi-threading, that map is protected with a mutex and a spatial lock, similarly to the main voxel storage map of VoxelTerrain. The maps are separate, exploiting the fact that partially generated chunks and "active" chunks in the game should never interact.

The terrain still generates chunk by chunk, by spawning one threaded task per chunk. The difference is how that task runs:

Overall, this is pretty much like classic chunk generation, with recursive generation of dependencies when needed, except dependencies are spread out across multiple threads instead of being directly run into the same call stack (which would end up single-threaded).

Problems:

Pyramid approach

An alternative I was thinking about, was to extend the way VoxelTerrain handles chunks to load and unload around viewers. Currently, it uses a box centered on viewers. Each time that box moves, VoxelTerrain calculates the difference between the new box and the previous box, which gives which chunks to load and which to unload.

Instead of having a single box, have multiple concentric boxes, each corresponding to a generation pass. The result of this, is that chunks will tend to get loaded in a pattern very similar to examples seen earlier:

1 1 1 1 1 1 1
1 2 2 2 2 2 1
1 2 3 3 3 2 1
1 2 3 3 3 2 1 
1 2 3 3 3 2 1 
1 2 2 2 2 2 1
1 1 1 1 1 1 1

However, each chunk can still take variable time to generate, so every tasks spawned to generate chunks still have to check neighbors, but this time maybe they don't need to spawn sub-tasks and can just wait a bit, since it is known that the viewers boxes would make dependencies eventually generate.

I haven't tested this approach yet so it's just an idea at the moment.


I'm still testing things around at the moment. Keep in mind that I'm trying to fit this in a way that doesn't create too much specific code in the engine, and doesn't change too many things at once. The downside of that, is that I'm making compromises that I also don't like much, but the alternative would mean to work extra weeks/months on this while also breaking compatibility everywhere (and adding bugs), all of which not being applicable to VoxelLodTerrain at all. There could also be different approaches I havent explored yet, I only listed those I thought about at length.

Zylann commented 10 months ago

First usable implementation is now available in the multipass_generation branch. It will be merged some time later. Docs here: https://github.com/Zylann/godot_voxel/blob/multipass_generation/doc/source/generators.md#multi-pass-generation-with-voxelgeneratormultipasscb API here: https://github.com/Zylann/godot_voxel/blob/multipass_generation/doc/source/api/VoxelGeneratorMultipassCB.md

Zylann commented 10 months ago

Now available in the master branch.

lavalyynx commented 1 month ago

Hey Zylann, not sure if this is the right place for my idea, but here it is: if I understand correctly, in the default cubic generation, a large chunk box (eg.: 12x12x12) is loaded around the viewer. But what if the generator only loads the chunks with a surface in it (contains AIR and SOLID or WATER)? One terrain surface chunk would load all 26 surrounding ones, and the ones with visible blocks in them will be added to a loading list. Interiating every frame on this could spread load the surface area around the player in a given radius. zylanns_voxel_example_spead2 Multipass generation: Only the surface chunks need a second pass. Tall tree example: A (ground) affects B(full AIR), but no the other way around. The border chunks will have to wait in a list until they get the full second pass. Result: GREEN = 2 pass complete - ORANGE = single pass - BLUE = waits for full 2.pass zylanns_voxel_example_spead_final1 Does this make sense? Could I somehow setup threaded loading of a custom chunk Array in godot? I might experiment with this sometime.

Zylann commented 1 month ago

But what if the generator only loads the chunks with a surface in it (contains AIR and SOLID or WATER)?

The generator can't magically do that because often you don't know that until you have actually done generation calculations. And with caves or other carvings having to be done in subsequent passes you pretty much can't count on these assumptions. Multi-threading is also very likely to throw a wrench in there, as when you start introducing random access patterns, it becomes extremely easy to end up with deadlocks, bad timings or single-threaded situations. And even then, generating only "visible" chunks is a radical change that only works if your game is designed from the ground up to expect that chunks within the player range will be empty. That is not just about the generator, but the streaming system. The generator doesn't decide what to generate, it is requested to by the streaming system. As for multipass on that... I already discarded 3D multipass because it's too annoying for use cases I considered. The idea sounds a bit interesting (similar concept as cave culling, except that's a rendering thing and not affecting the very chunks themselves), but it sounds like a real headache trying to retrofit that into the current system. Also what do you do when there is a floating island? How do you know it's there after having "followed" surface chunks from the player, but none of them "touch" the floating island?

You may have to create your own generator if you want to implement this specific idea for your game. The script API doesn't expose task scheduling, so you currently have to implement it with a module. And if you want to modify the way chunks get requested, you have to do even more changes which I doubt will be re-usable.

lavalyynx commented 1 month ago

I agree that this is very case specific, but for a game with terrain similar to Minecrafts it would be conceivable. Rather uncomplicated caves should be possible in the first pass with combined 3d noises. Oh, Now I understand the steaming system, that makes sense, I thought the VoxelViewer decides what to load in the background. I'll start my project with cubic loading, so I can eventually implement my idea into the module afterwards.

If the world accidentally creates a floating island which is +16 blocks above the surface, it will stay invisible and the player hopefully doesn't randomly discover it. Intentionally created islands in the sky could be found by loading a plane of chunks above the player. But yeah Multipass would be very tricky to setup.

Anyway, thank you for sharing your expertise and thoughts about this! This module is awesome :)