Open ChrisTechs opened 2 weeks ago
~So, it turns out you can't do that, unfortunately. You need the extra corner blocks to calculate the AO value correctly ;/~
Scratch that, it seems this implementation doesn't actually use this data for AO yet. In any case, it should, since that data is already available :p
(Also, the separate loops from before will run much faster because that memory is accessed in sequence, so there is a good chance the CPU still has the data cached.)
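To illustrate the point about sequential access, here is a minimal sketch that walks the same flat buffer in two orders. The layout (x fastest, then y, then z), the value of `CHUNK_SIZE_P`, and the helper names are assumptions for illustration only, not taken from this repository. Both functions compute the same sum; the only difference is whether consecutive iterations touch adjacent memory or jump a whole z-slice each step.

```rust
// Hypothetical padded chunk edge length; the thread does not state the real value.
const CHUNK_SIZE_P: usize = 34;

/// Assumed layout: x varies fastest, then y, then z.
fn linear_index(x: usize, y: usize, z: usize) -> usize {
    x + y * CHUNK_SIZE_P + z * CHUNK_SIZE_P * CHUNK_SIZE_P
}

/// Visits the buffer in storage order: each step moves to the adjacent element.
fn sum_sequential(blocks: &[u32]) -> u64 {
    let mut total = 0u64;
    for z in 0..CHUNK_SIZE_P {
        for y in 0..CHUNK_SIZE_P {
            for x in 0..CHUNK_SIZE_P {
                total += blocks[linear_index(x, y, z)] as u64;
            }
        }
    }
    total
}

/// Same work with z innermost: each step jumps CHUNK_SIZE_P * CHUNK_SIZE_P elements.
fn sum_strided(blocks: &[u32]) -> u64 {
    let mut total = 0u64;
    for x in 0..CHUNK_SIZE_P {
        for y in 0..CHUNK_SIZE_P {
            for z in 0..CHUNK_SIZE_P {
                total += blocks[linear_index(x, y, z)] as u64;
            }
        }
    }
    total
}

/// Builds a buffer with arbitrary but deterministic contents.
fn make_blocks() -> Vec<u32> {
    (0..CHUNK_SIZE_P.pow(3)).map(|i| (i % 7) as u32).collect()
}
```

Calling `sum_sequential(&make_blocks())` and `sum_strided(&make_blocks())` gives identical results, so any timing difference comes from the access pattern alone. How large that difference is depends on the machine, so it is worth benchmarking rather than assuming.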
Yes, it should use the extra corner blocks to calculate AO values correctly.
After testing, I did find that getting neighbour data with one loop is faster by about 1 microsecond on average on my machine, but I found a better solution:
```rust
// Process x-axis boundaries
for y in 0..CHUNK_SIZE_P {
    for z in 0..CHUNK_SIZE_P {
        let nx = ivec3(-1, y as i32, z as i32);
        let px = ivec3(CHUNK_SIZE_P as i32, y as i32, z as i32);
        add_voxel_to_axis_cols(chunks_refs.get_block(nx), 0, y, z, &mut axis_cols);
        add_voxel_to_axis_cols(chunks_refs.get_block(px), CHUNK_SIZE_P - 1, y, z, &mut axis_cols);
    }
}
```
and then do that for every axis.
I found this to be faster by 2 microseconds on average on my machine.
Ah yeah, nice update! Though now you might realize that, by processing each plane individually, you actually end up processing the corner blocks multiple times.
What I did after I realized that was separate the loops a bit further, and then you end up with the implementation I have in my MR (#6).
Notice how I separate the loops: I found that if I focus the loops on removing the z access jumps, the code runs much faster.
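To make the overlap concrete, here is a small counting sketch. It assumes `CHUNK_SIZE = 32` and `CHUNK_SIZE_P = CHUNK_SIZE + 2` (neither value is stated in this thread), and the function names are hypothetical. It walks all six boundary planes of the padded cube in full and records how often each cell is visited.

```rust
use std::collections::HashMap;

// Assumed values for illustration; the thread does not state them.
const CHUNK_SIZE: usize = 32;
const CHUNK_SIZE_P: usize = CHUNK_SIZE + 2;

/// Walks the six full boundary planes of the padded cube and counts
/// how many times each boundary cell is visited.
fn count_boundary_visits() -> HashMap<(usize, usize, usize), u32> {
    let last = CHUNK_SIZE_P - 1;
    let mut visits = HashMap::new();
    for a in 0..CHUNK_SIZE_P {
        for b in 0..CHUNK_SIZE_P {
            for cell in [
                (0, a, b),
                (last, a, b),
                (a, 0, b),
                (a, last, b),
                (a, b, 0),
                (a, b, last),
            ] {
                *visits.entry(cell).or_insert(0) += 1;
            }
        }
    }
    visits
}

/// Total number of cell visits, including repeats.
fn total_visits(visits: &HashMap<(usize, usize, usize), u32>) -> u32 {
    visits.values().sum()
}

/// Number of distinct cells that were visited exactly `n` times.
fn cells_visited_exactly(visits: &HashMap<(usize, usize, usize), u32>, n: u32) -> usize {
    visits.values().filter(|&&v| v == n).count()
}
```

Under these assumptions the six planes produce 6936 visits for only 6536 distinct cells: the 8 corners are visited three times each, and the 384 non-corner edge cells twice each. So the repeated work is not limited to the corners; the edges shared by two planes are also processed more than once.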
Yeah, but it should be quite easy to skip corners: I can just change the start and end range of certain loops.
Only CHUNK_SIZE x CHUNK_SIZE blocks per side of a chunk have neighbours, so only CHUNK_SIZE x CHUNK_SIZE blocks are added as padding per chunk side instead of CHUNK_SIZE_P x CHUNK_SIZE_P blocks.
The code that adds neighbours is moved into a single nested for loop: the chunk is assumed to be a cube, so every side of a chunk has the same number of blocks.
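The face-only approach described above could look roughly like the following sketch. It is not the code from this change: `CHUNK_SIZE = 32`, the plain `[i32; 3]` coordinates, and the function names are assumptions for illustration. Positions are relative to the centre chunk, with `-1` and `CHUNK_SIZE` as the padding layers, matching the `ivec3(-1, ...)` convention used earlier in the thread.

```rust
use std::collections::HashSet;

// Assumed value for illustration; the thread does not state it.
const CHUNK_SIZE: i32 = 32;

/// Collects the padding positions for all six faces in one nested loop.
/// Because the chunk is a cube, the same (a, b) range serves every face.
fn face_padding_positions() -> Vec<[i32; 3]> {
    let mut positions = Vec::with_capacity((6 * CHUNK_SIZE * CHUNK_SIZE) as usize);
    for a in 0..CHUNK_SIZE {
        for b in 0..CHUNK_SIZE {
            positions.push([-1, a, b]); // -x face
            positions.push([CHUNK_SIZE, a, b]); // +x face
            positions.push([a, -1, b]); // -y face
            positions.push([a, CHUNK_SIZE, b]); // +y face
            positions.push([a, b, -1]); // -z face
            positions.push([a, b, CHUNK_SIZE]); // +z face
        }
    }
    positions
}

/// True if the position lies on more than one padding layer,
/// i.e. on an edge or corner of the padded cube.
fn is_edge_or_corner(p: &[i32; 3]) -> bool {
    p.iter().filter(|&&c| c == -1 || c == CHUNK_SIZE).count() > 1
}

/// True if no position appears twice.
fn all_unique(positions: &[[i32; 3]]) -> bool {
    let unique: HashSet<[i32; 3]> = positions.iter().copied().collect();
    unique.len() == positions.len()
}
```

With these assumptions the loop yields 6 x 32 x 32 = 6144 positions, each visited once, and none of them on an edge or corner. The trade-off is that the edge and corner cells are no longer fetched at all, which matters if AO is later changed to use the corner blocks, as discussed earlier in the thread.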