wrnrlr / g3

Neat 3D math and graphics library
ISC License
32 stars 2 forks source link

Suggestion, change order representation to match common GPU expectations #5

Closed Makogan closed 1 year ago

Makogan commented 1 year ago

This is a simple suggestion that may clash with other requirements. It's merely in case this is a possible change.

The definition of point in the library is as follows:

pub struct Point (pub(crate) f32x4);

impl Point {
  #[inline] pub fn w(&self)->f32 { self.0[0] }
  #[inline] pub fn e123(&self)->f32 { self.w() }
  #[inline] pub fn x(&self)->f32 { self.0[1] }
  #[inline] pub fn e032(&self)->f32 { self.x() }
  #[inline] pub fn y(&self)->f32 { self.0[2] }
  #[inline] pub fn e013(&self)->f32 { self.y() }
  #[inline] pub fn z(&self)->f32 { self.0[3] }
  #[inline] pub fn e021(&self)->f32 { self.z() }

This makes perfect sense since multivectors are commonly written with the scalar term on the left most position.

However most hardware designed for geometry, primarily GPUs actually fllow the convention xyzw.

So if one serializes the points as is and send them to a shader, the results will be incorrect. w becomes x, x becomes y and so on, thus you have to write this:

  #version 450
#extension GL_EXT_scalar_block_layout : enable

layout(location = 0) in vec4 inPosition;

layout(binding = 0) uniform MVPOnlyUbo
{
    mat4 model;
    mat4 view;
    mat4 proj;
};

void main() {
    gl_Position = proj * view * model * vec4(inPosition.yzw, 1.0);
}

Note the strange recasting inPosition.yzw, this is necessary since the layout the GPU expects and the layout in the library are different. In order to support fast and efficient byte compatibility I would like to suggest moving the w term at the end of the array.

Alternatively the compact point suggested in https://github.com/wrnrlr/g3/issues/4 could also solve this if I get around to implementing it.

wrnrlr commented 1 year ago

Yes I agree this is a shortcoming in the design that is a result of g3 starting off as a port of klein.

This is also the reason for setting the visibility of the Point members to pub(crate) so this can be changed in the future without breaking anything else.

BTW, I think that most often it is better todo this shuffle on the CPU especially if you are using an index buffer. I've added some new methods to help with this.

It will be a lot of work™ to implement this change just to save one more shuffle. In the future I hope to be able to implement some of the internals with a macro and avoid having to do all this by hand.

This change might also make it possible to work nicely with rust-gpu.