TanTanDev / binary_greedy_mesher_demo


Improved getting neighbor chunk voxels #7

Open ChrisTechs opened 2 weeks ago

ChrisTechs commented 2 weeks ago

Only CHUNK_SIZE x CHUNK_SIZE blocks per side of a chunk have neighbours, so only CHUNK_SIZE x CHUNK_SIZE blocks are added as padding per chunk side instead of CHUNK_SIZE_P x CHUNK_SIZE_P blocks.

The methods that add neighbours are merged into a single nested for loop, since the chunk is assumed to be a cube and therefore every side of a chunk has the same number of blocks.
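As a rough illustration of the saving per side, here is a minimal sketch. It assumes `CHUNK_SIZE = 32` and `CHUNK_SIZE_P = CHUNK_SIZE + 2` (one block of padding on each side); those values and all three function names are assumptions for illustration, not taken from the PR.

```rust
// Assumed sizes: one block of padding on each side of a 32-block chunk.
// The demo's actual constants may differ.
const CHUNK_SIZE: usize = 32;
const CHUNK_SIZE_P: usize = CHUNK_SIZE + 2;

/// Padding blocks copied per chunk side when only direct face neighbours are read.
pub fn face_only_padding_per_side() -> usize {
    CHUNK_SIZE * CHUNK_SIZE
}

/// Padding blocks copied per chunk side when the whole padded plane is read,
/// including the edge and corner cells.
pub fn full_plane_padding_per_side() -> usize {
    CHUNK_SIZE_P * CHUNK_SIZE_P
}

/// Blocks per side that the face-only approach skips.
pub fn skipped_per_side() -> usize {
    full_plane_padding_per_side() - face_only_padding_per_side()
}
```

Under these assumed sizes the face-only version copies 1024 blocks per side instead of 1156. The skipped blocks are exactly the edge and corner cells of the padded plane.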

duckdoom5 commented 2 weeks ago

~So, turns out you can't do that unfortunately. You need the extra corner blocks to correctly calculate the AO value ;/~ Scratch that, it seems this implementation doesn't actually use this data for AO yet. In any case, it should, since that data is already available :p

(Also, the separate loops from before will run much faster because that memory is accessed in sequence, meaning there is a high chance the CPU still has the data cached.)
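To make the point about sequential access concrete, here is a minimal sketch of how the inner-loop axis changes the distance between consecutive memory accesses. It assumes a flat array with z varying fastest and `CHUNK_SIZE_P = 34`. The demo's real storage layout may differ, and `linear_index` and `inner_loop_stride` are hypothetical helpers.

```rust
// Assumed padded chunk size; the demo's actual constant may differ.
const CHUNK_SIZE_P: usize = 34;

/// Hypothetical flat index into a CHUNK_SIZE_P^3 array with z varying fastest.
pub fn linear_index(x: usize, y: usize, z: usize) -> usize {
    (x * CHUNK_SIZE_P + y) * CHUNK_SIZE_P + z
}

/// Distance, in elements, between two consecutive inner-loop iterations
/// when `axis` is the innermost loop (0 = x, 1 = y, anything else = z).
pub fn inner_loop_stride(axis: usize) -> usize {
    let (dx, dy, dz) = match axis {
        0 => (1, 0, 0),
        1 => (0, 1, 0),
        _ => (0, 0, 1),
    };
    linear_index(dx, dy, dz) - linear_index(0, 0, 0)
}
```

With this assumed layout, looping over z innermost touches adjacent elements (stride 1), while looping over x innermost jumps a whole plane each step (stride 1156), which is much less cache friendly.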

ChrisTechs commented 1 week ago

> ~So, turns out you can't do that unfortunately. You need the extra corner blocks to correctly calculate the AO value ;/~ Scratch that, it seems this implementation doesn't actually use this data for AO yet. In any case, it should, since that data is already available :p
>
> (Also, the separate loops from before will run much faster because that memory is accessed in sequence, meaning there is a high chance the CPU still has the data cached.)

Yes, it should use the extra corner blocks to correctly calculate AO values.

After testing, I did find that getting neighbour data using one loop is faster on average by 1 microsecond on my machine, but I found a better solution:

```rust
// Process x-axis boundaries
for y in 0..CHUNK_SIZE_P {
    for z in 0..CHUNK_SIZE_P {
        let nx = ivec3(-1, y as i32, z as i32);
        let px = ivec3(CHUNK_SIZE_P as i32, y as i32, z as i32);
        add_voxel_to_axis_cols(chunks_refs.get_block(nx), 0, y, z, &mut axis_cols);
        add_voxel_to_axis_cols(chunks_refs.get_block(px), CHUNK_SIZE_P - 1, y, z, &mut axis_cols);
    }
}
```

and then do the same for every axis. I found this to be faster by 2 microseconds on average on my machine.

duckdoom5 commented 1 week ago

Ah yeah, nice update! Though now you might realize that, by processing each plane individually, you actually end up processing the corner blocks multiple times.

What I did after I realized that was separate the loops a bit further, and then you end up with the implementation I have in my MR (#6).

Notice how I separate the loops: I found that if I focus the loops on removing the z access jumps, the code runs much faster.
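To put a number on the repeated work, here is a minimal sketch that counts how often each padded cell is touched when all six boundary planes are walked over the full `0..CHUNK_SIZE_P` range on both free axes. It assumes `CHUNK_SIZE_P = 34`. `count_plane_visits` is a hypothetical helper that only tallies visits and does not read any voxel data.

```rust
// Assumed padded chunk size; the demo's actual constant may differ.
const CHUNK_SIZE_P: usize = 34;

/// Returns a visit counter per padded cell after walking all six boundary
/// planes with the full range on both free axes.
pub fn count_plane_visits() -> Vec<u8> {
    let p = CHUNK_SIZE_P;
    let mut visits = vec![0u8; p * p * p];
    // Flat index used only for the counter array in this sketch.
    let idx = |x: usize, y: usize, z: usize| (x * p + y) * p + z;
    for a in 0..p {
        for b in 0..p {
            for side in [0, p - 1] {
                visits[idx(side, a, b)] += 1; // x = 0 and x = p - 1 planes
                visits[idx(a, side, b)] += 1; // y = 0 and y = p - 1 planes
                visits[idx(a, b, side)] += 1; // z = 0 and z = p - 1 planes
            }
        }
    }
    visits
}
```

Under this assumption the six planes make 6936 visits but cover only 6536 distinct cells. The 8 corner cells are visited three times each, and the 384 remaining edge cells twice each.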

ChrisTechs commented 1 week ago

> Ah yeah, nice update! Though now you might realize that, by processing each plane individually, you actually end up processing the corner blocks multiple times.
>
> What I did after I realized that was separate the loops a bit further, and then you end up with the implementation I have in my MR (#6).
>
> Notice how I separate the loops: I found that if I focus the loops on removing the z access jumps, the code runs much faster.

Yeah, but it should be quite easy to skip corners: I can just change the start and end range of certain loops.
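One possible way to trim the ranges, as a minimal sketch: keep the full range for the x planes, drop the two x boundary rows from the y planes, and drop both the x and y boundary rows from the z planes. This assumes `CHUNK_SIZE_P = 34`. `count_trimmed_visits` is a hypothetical helper that only tallies visits, and it is not the loop structure used in this PR or in #6.

```rust
// Assumed padded chunk size; the demo's actual constant may differ.
const CHUNK_SIZE_P: usize = 34;

/// Returns a visit counter per padded cell after walking the six boundary
/// planes with ranges trimmed so that edges and corners are not repeated.
pub fn count_trimmed_visits() -> Vec<u8> {
    let p = CHUNK_SIZE_P;
    let mut visits = vec![0u8; p * p * p];
    // Flat index used only for the counter array in this sketch.
    let idx = |x: usize, y: usize, z: usize| (x * p + y) * p + z;
    for side in [0, p - 1] {
        // x planes: full range, so they own every edge and corner they touch.
        for y in 0..p {
            for z in 0..p {
                visits[idx(side, y, z)] += 1;
            }
        }
        // y planes: skip x = 0 and x = p - 1, already covered by the x planes.
        for x in 1..p - 1 {
            for z in 0..p {
                visits[idx(x, side, z)] += 1;
            }
        }
        // z planes: skip the x and y boundaries, already covered above.
        for x in 1..p - 1 {
            for y in 1..p - 1 {
                visits[idx(x, y, side)] += 1;
            }
        }
    }
    visits
}
```

With these ranges every boundary cell, corners included, is visited exactly once, so the AO data is still there and nothing is processed twice. Which axis keeps the full range is a free choice and probably worth picking based on the memory layout.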