Open eddyb opened 3 years ago
One way we could avoid the limitations around Rust not having custom layouts, is requiring #[repr(C, packed)]
- that way, there is no interior padding between fields, and if needed we could force the user to insert explicit unused fields to make it compatible with e.g. std140
/std430
/etc.
EDIT: I should note that #[repr(C, packed(4))]
works and unlike plain packed
, allows taking references to fields (for fields of a type that itself requires an alignment of 4
or less). I should probably look into all of the possible SPIR-V layout rules to find which ones that matches up to.
usize should also be banned in structs - easy mistake to make and end up with 64 bits on the host, and 32 bits in the shader.
We could probably use the JSON output we have nowadays to reflect all types used in the interface, so that we can either generate appropriate Rust types, or validate (via compile-time assert!
s) that the definitions used on the host end up with the same field offsets, compatible field types, etc.
@eddyb oh that sounds like an interesting idea, really do want some static validation / enforcement of these type of layout rules both for shader types and also we sees similar problems in our native and WASM code, so easy to miss a repr(C)
and other things. @termhn been exploring this in native land and with bytemuck
, our own experiments and such crates also.
could have specific solutions for the GPU shaders but feels like a larger shared problem here also
Yeah, generating or validating Rust structs would be really cool. Indeed easy to miss a #repr(C) and spend a whole day debugging why the data isn't getting to the shader right (ask me how I know...)
Wish repr(C)
automatically provided a marker trait to the type so generic code could require it. Well that and a repr(C)
that required recursively that all fields and their types also was repr(C)
.
Not sure if anything like that could be feasible?
I just ran into this problem with glam::Vec3
. I didn't realize that its layout is 16bytes on GPU, but 12 on CPU.
I have found a solution crevice
example: common/src/lib.rs
#![cfg_attr(target_arch = "spirv", no_std)]
use spirv_std::glam as glam;
use glam::{Vec3, UVec2, Vec4};
#[repr(C)]
#[derive(Copy, Clone)]
#[cfg_attr(not(target_arch = "spirv"), derive(Debug, crevice::std430::AsStd430))]
pub struct Input {
pub camera_pos: Vec3,
pub camera_quat: Vec4, // no Quat :(
pub camera_fov: f32,
pub screen_size: UVec2,
pub time: f32,
}
common/Cargo.toml
[target.'cfg(not(target_arch = "spirv"))'.dependencies]
crevice = { version = "0.14", features = ["glam"] }
crevice implements as_std[430,140] with give structs with bytemuck pod and as_bytes
crevice implements as_std[430,140] with give structs with bytemuck pod and as_bytes
Yes, some projects use such derives, but this is offtopic on this issue (which is about making it hard/impossible for the same definition to be interpreted differently) and there's no guarantee that the GPU-side layout of any type matches that crate you are using (in fact, you would want FromStd430
, if it exists, GPU-side, but it likely does direct memory manipulation not yet supported in Rust-GPU).
On the Discord we did find an interesting way to use crevice
: instead of only using it on the CPU side, and just hoping Rust-GPU matches, you could do this instead:
use crevice::std430::{AsStd430, Vec3};
#[derive(AsStd430)]
pub struct __Light_HighLevel {
position: Vec3,
intensity: f32,
color: Vec3,
}
type Light = <__Light_HighLevel as AsStd430>::Output;
pub fn shader(lights: &[Light]) {
for i in 0..lights.len() {
let Light { position, intensity, color, .. } = lights[i];
let _x = position.x + intensity + color.x;
}
}
(Vec3
in particular tends to be annoying to support cleanly, but this way you could rely on crevice
generating a repr(C)
type that should have the same layout, on all targets)
What if rust-gpu
used serde
's types & macro as it's interface for specifying "this needs to be the same across CPU and GPU" and use the method/format an implementation detail? The compiler would then know the exact way the type serializes to the serde/rust datamodel and can under the hood use something likecrevice
as a serde output format. Or you could introduce a #[derive(SendGpu)]
and #[derive(SendCpu)
which would act like the serde traits under the hood if you want less magic.
@LegNeato that sounds a lot like what crabslab does, though crabslab stores and reads that data from an explicit buffer of &[u32]
.
Yeah, I guess there are two target users here: those that want to control how things are padded / sent to the GPU explicitly, and those that just want it sent "over the bridge" and don't care how.
Sadly, it's hard/impossible to account for
#[cfg]
s in e.g.glam
, and vector types in general tend to have weirder alignment rules than scalars (and aggregates thereof).But we should at the very least require
#[repr(C)]
, and maybe even disallow any interior padding or wildly varying alignments (as that would imply potentially different alignments on the host).I almost wish Rust had explicit field offsets with a per-field attribute - we could require that to guarantee that the host would see the same layout and it would let us have the weirder layouts mentioned in #11.