norman784 / gaiku

3d agnostic framework (game engine) terrain engine
40 stars 1 forks source link

Is the voxel baker going too far? #47

Closed QuantumEntangledAndy closed 3 years ago

QuantumEntangledAndy commented 3 years ago

So I have been expanding the traits integrating density into the chunk system. Time and time again the voxel baker is giving me issues.

Consider this input data which is generated from noise ranging from 0m to 3m

0 1 1 0
0 1 1 1
0 1 0 1
1 1 1 0

If I bake this in voxel I get

□■■□ 
□■■■
□■□■
■■■□

Now consider the coordinates of the vertices they range from 0m to 4m. But I defined data from 0m to 3m

Now consider what happens if a split the chunk up (for the chunk tree for example)

Part A defined using data 0m-1m

0 1
0 1
0 1
1 1

Part B defined using data from 1m-2m

1 1
1 1
1 0
1 1 

Part C defined using data from 2m-3m

1 0
1 1
0 1
1 0

If we generate voxels in chunks then stitch them back together they are now 6m wide!

The marching cubes baker does not have this issues as its for loop is a half open range

0<=range<end

I could of course rework my input to not overlap like this. But it makes it incosistent with how other bakers like marching cubes and other ones I want to implement work. It also makes density difficult to work with.

In the case of density we are not limited to the grid points. It is perfectly fine to make a voxel at 0.99999999 depending on the chosen resolution. In this case we must use half open limits because reworking input to not overlap would stop it from being possible. e.g.

Part A without overlap

0
0
0
1 

Part B without overlap

1
1
1
1 

Without including both ends in the input there is only one value in the x present so interpolation at for example x=0.999999 is impossible.

I apologies if this is a confusing issue.

In short I think maybe the voxel baker should bake up to but not including the last grid point.

I could implement another voxel baker (e.g. blocky) that works more inline with this half open range present in other bakers. I just wanted your opinion on the matter.

QuantumEntangledAndy commented 3 years ago

I have a proposal to fix this:

Consider first how we are currently doing it (in 2d)

For this input

image

We render it as

image

If we split it into chunks this happens

image

We over render at the boundaries. This is the same problem where we render faces at the chunk edges where we shouldn't

What I propose instead is we work on grids of 4 points much like the marching cubes and other surfacing algorithms

image

This has the benefit of not rendering the boundary faces of the neighbouring chunks

image

It works like marching cubes where we observe 4 control points (8 in 3d) and mesh it based on the cases

image

It has the downside of increasing the vertex and face count

image


Pros

Cons

QuantumEntangledAndy commented 3 years ago

So I have a mostly working version of this baker. It's kinda nice because it's very similar to the marching cubes.

It plays well with density based data where the begining and end of the chunk are duplicated in neighbouring chunks too. But it stuggles with the gox format. Because the chunks in the gox format dosent duplicate the ends in their chunks.

I am seeing maybe two ways to solve this.

One keep this as a separate baker. That expects the end points of chunks to overlap. This will increase the complexity of future development as we will have to maintain multiple bakes for each type. Bakers with and without ends duplicated.

Two we adjust how the chunks are loaded on the gox format. I am thinking load a gox into a single chunk and allow the user to chunk it as they see fit. Or load the chunks with the duplicated border around them

QuantumEntangledAndy commented 3 years ago

Comments on this would be appreciated

QuantumEntangledAndy commented 3 years ago

Here's how marching cube and voxel looks like with the new baker I am working on:

Screenshot 2021-06-23 at 19 04 25

To get the atlas to work right on the marching cubes I am going to need to regenerate the marching cubes tables so that they include extra faces at where the atlas splits. Because one triangle on the marching cube can cross an atlas boundary.

norman784 commented 3 years ago

So I have a mostly working version of this baker. It's kinda nice because it's very similar to the marching cubes.

It plays well with density based data where the begining and end of the chunk are duplicated in neighbouring chunks too. But it stuggles with the gox format. Because the chunks in the gox format dosent duplicate the ends in their chunks.

I am seeing maybe two ways to solve this.

One keep this as a separate baker. That expects the end points of chunks to overlap. This will increase the complexity of future development as we will have to maintain multiple bakes for each type. Bakers with and without ends duplicated.

Two we adjust how the chunks are loaded on the gox format. I am thinking load a gox into a single chunk and allow the user to chunk it as they see fit. Or load the chunks with the duplicated border around them

Sorry for not answering earlier, I didn't have time yet to look at the materials you send me about densities (in other issue), so I didn't have anything to say on top of that.

But about the bakers/chunks, we should definitely populate the neighbor chunk data into each chunk, but I didn't found a solution that I liked. Becuse not having that data would make our bakers generating extra unnecessary vertices in the chunk boundaries (that is currently our case).

Pros dosent render boundary facing so no need to query neighbours makes chunking easier to implement works very similar to other surfacing algorithms Cons creates extra verticies and faces

I would say if the new implementation is not so slow comparing with the current one (you can run the benchmaks for both and see the difference in your machine), I think the pros weight more than the cons.

Here's how marching cube and voxel looks like with the new baker I am working on:

Screenshot 2021-06-23 at 19 04 25

The marching cubes is looking definitely better than the previous implementation.

To get the atlas to work right on the marching cubes I am going to need to regenerate the marching cubes tables so that they include extra faces at where the atlas splits. Because one triangle on the marching cube can cross an atlas boundary.

+1

QuantumEntangledAndy commented 3 years ago

I imagine it would be slower as I am doing lerps and other interesting maths so that we can have isovalues.

I think we should use the new method even if it is a bit slower on the benchmarks as this implementation allows for isovalues while the current one does not.

QuantumEntangledAndy commented 3 years ago

So the maths took me much longer than I would have liked, but i finally reworked the marching cubes tables so that we have extra triangles that split at the atlas boundaries. It looks like this now:

Screenshot 2021-06-29 at 08 52 21

I feel like maybe this baker should not be called marching cubes now as the tables are completly different now so perhaps I should add it as another baker called modified marching cubes.

QuantumEntangledAndy commented 3 years ago

I saw these odd green rings in the previous image at the boundaries. These were occurring because they were assigned to air atlases (0) which corresponded to tree green.

I fixed that by modifying the find nearest atlas code to find nearest non air atlas.

Screenshot 2021-06-29 at 10 24 39

But which atlas is ambiguous at the edge case causing this:

Screenshot 2021-06-29 at 10 21 32

I will keep poking at it

QuantumEntangledAndy commented 3 years ago

I am thinking we should leave it as the first case because then the user can define a definitive color by assigning an atlas to the neighboring air.

If I try to code something it will either make it all the upper or all the lower case and which case the user wants may differ.

QuantumEntangledAndy commented 3 years ago

This for example is what happens when we assign it to the cross diagonal:

Screenshot 2021-06-29 at 10 45 02

Some atlases are correct but in some cases it makes the wrong choice. For example at the tree building boundaries. The tree should press into the building not the other way around.

QuantumEntangledAndy commented 3 years ago

Ok so thinking about it more... What if I add extra faces at the ambigous case boundaries so that it forms neat lines... Ok that means extra polygons and more splitting of the faces, but what does it look like... A few quick changes to the face splitting algorthims to include the diagonal planes as well as the orthogonal x,y,z, planes and I get this:

Screenshot 2021-06-29 at 11 24 05

Not perfect but I am happy with it

QuantumEntangledAndy commented 3 years ago

Just for a nice comparison. Here's all three bakes:

Screenshot 2021-06-29 at 14 56 51
QuantumEntangledAndy commented 3 years ago

Ok so today I fixed up the tests and examples etc as well as ran some benches. I have a few ideas to speed things up but this is the first set of data.

Algorithm Master (s) New (s)
Marching cubes 0.11788 0.75364
Modified Marching cubes N/A 15.363
Voxel 0.17622 5.2526

The marching cubes one is surprising as its only adding UVs and normals and the UVs mostly come from a look up table.

Modified marching cubes is perhaps expected due to the number of extra verts and the coordinate system I used (baricentric)

Voxel is also surprising. There are not that many extra verts but this code is the least optimized so maybe this can be improved.

QuantumEntangledAndy commented 3 years ago

So I did a flame graph on the bench mark of the Voxel baker (so 100x baking of the terrain etc) and got this:

Screenshot 2021-06-30 at 13 20 13

It seems to be spending a lot of time in the octree insert code. This explains why the number of verts causes such a significant speed down. I might look into optimizing this or replacing it with another type of tree.

QuantumEntangledAndy commented 3 years ago

I made a MeshBuilder using an rstar tree crate and got this for the flame graph

Screenshot 2021-06-30 at 16 37 46

Remember when reading a flame graph that width represents % time spent in this code block. So as time spent in the tree decreases (gets thiner) the other parts of the code (like my part) gets thicker

norman784 commented 3 years ago

It looks interesting, did you check how the benchmark improved after this change?

QuantumEntangledAndy commented 3 years ago

It's running now

QuantumEntangledAndy commented 3 years ago
Algorithm OctTree (s) RStar (s)
Marching cubes 0.75364 0.90532
Modified Marching cubes 15.363 4.2191
Voxel 5.2526 3.8111

Seems to get better the more vertexes there are :)

QuantumEntangledAndy commented 3 years ago

The r* takes up 63% of the total time (octtree took 80%). This is still quite long. Should we consider NOT removing duplicate verts during bake. Instead add it as a method to Mesh (remove_duplicate_verts) so that the user can decide if they want to use the cycles on this. I imagine that most modern machines can handle lots of polygons and verts on their gpus during render.

QuantumEntangledAndy commented 3 years ago

We may get issues with rounding errors causing artifacts though... hmmmm.

QuantumEntangledAndy commented 3 years ago

Alternativly perhaps we could look into a GPU implementation. I'd like to move more of the voxel code to the gpu some day but that will require a lot of research and google time.

QuantumEntangledAndy commented 3 years ago

So I ran the bench mark without duplicate vert removal and got this:

Algorithm RStar (s) No Removal (s)
Marching cubes 0.90532 0.18027
Modified Marching cubes 4.2191 0.45264
Voxel 3.8111 0.38397

It is considerably faster!

With this for the render:

Screenshot 2021-07-01 at 14 04 46
QuantumEntangledAndy commented 3 years ago

So because I feel like the octree should not be that slow I tried optimizing it.

I got it down to 62% (r* was 63%) and the now comparison looks like this

Algorithm OctTree (s) RStar (s)
Marching cubes 0.64465 0.90532
Modified Marching cubes 2.9852 4.2191
Voxel 2.5123 3.8111
QuantumEntangledAndy commented 3 years ago

P.s. I also discovered that when running long time scale benchmarks we should use criterions Flat Sampling Mode this improves the accuracy of the measured time and reduces the time the benches takes.

QuantumEntangledAndy commented 3 years ago

Ok so I am about reading to write up the PRs. Since a lot changed here I want to know how you'd like them incorperated. One PR or a few smaller ones.

Features are:

norman784 commented 3 years ago

A few PR's as possible in this case, put everything that you think is ready for review.

QuantumEntangledAndy commented 3 years ago

Closing because of #48 being merged :)