Unpack indices of arbitrary component types

shawnhatori commented 11 months ago

Since the memcpy only needs the size of the index data, it can use the component size (when equal to the stride) to determine the total number of bytes to copy. This allows it to work for different component types, including the common cgltf_component_type_r_16u.

This also adds a reference comment for the function at the top of the file, consistent with cgltf_accessor_unpack_floats.

zeux commented 11 months ago

This doesn't seem right. When the input data is floats, we will now memcpy floats into 32-bit unsigned int output which is not what the caller expects. When the input data is r16u, we will now memcpy 16-bit indices into 32-bit unsigned int output which is also not what the caller expects. This function gives the caller a 32-bit unsigned index buffer while handling format conversions; the memcpy is an optimization for the case where no conversion is required.

shawnhatori commented 11 months ago

You're right, the change as written was not clear. I'll explain the intention.

What the API currently does: Requires index data to be stored in a uint32_t array. If the component size is smaller than uint32_t, each element is padded to 32-bit boundaries (e.g. for r16u, 2, 1, 0, 3 -> 0x02000000 0x01000000 0x00000000 0x03000000).

What I was expecting: Takes in a void* for an index array and an expected component_size (i.e. the data type of the index array). If the component_size of the accessor is larger than expected, returns 0. If it is smaller than expected, pad to expected (this replicates the existing padding behavior but for any larger type). Else, they're equal so memcopy everything (fast path) (e.g. for r16u, 2, 1, 0, 3 -> 0x0200 0x0100 0x0000 0x0300).

In my use case (and I assume many others), I own and know my mesh data, which all have 16-bit index values so my indices are stored as an array of uint16_t. My graphics API backends are also expecting index data packed as R16U. So I was hoping to have an API function to handle the unpacking boilerplate that respects the data type of the index array.

I'll leave it up to the maintainers to determine if that's a use case worth supporting, and if so, how to without a breaking API change. For the sake of discussion, I'll add a temporary new function that aims to handle this case as described above.

shawnhatori commented 9 months ago

@zeux Does the most recent code change address the data type issues you identified in your previous comment? I believe it's correct now.

For what it's worth, I've been using this change in my own asset pipeline to support my 16-bit index use case and it has worked great. I did test in a debugger using 32-bit indices and the data layout appears correct (i.e. no breaking change).

If this is a use case worth supporting, the interface obviously needs tweaks. This could possibly be an _ex function.

zeux commented 9 months ago

Yeah I think this addresses my concerns; I tested this in gltfpack which uses 32-bit indices and the change works without issues. I think we would need to either add this argument as mandatory to unpack_indices, or call the new function something else. Because unpack_indices was added after the last numbered release, I would recommend just adding a required argument to the function - all callers should be trivial to adjust.

There's still one corner case that I mentioned that isn't handled the same way with this change: namely, when the input is floats, memcpy will reinterpret the bits if the destination is unsigned int. If the component type is passed instead of a component size, the function can do memcpy only when the accessor type matches the expected type. You will note that cgltf_component_read_index does support floats as the input.

However, thinking about this and looking at the history, I am not sure why we're supporting reading floats as indices in the first place. This was initially introduced all the way back in https://github.com/jkuhlmann/cgltf/commit/1f7712844caaffb6d9f42a04178337de3615bd3d and then gradually refactored into the current state. I would expect read_index to be used when glTF spec disallows floating point values, and the only example OTOH that's even theoretically plausible is feature IDs in upcoming EXT_mesh_features, that I would reasonably expect to read as floats.

Maybe @prideout has context here -- is reading floats-as-indices via cgltf_accessor_read_index used in Filament?

prideout commented 9 months ago

Maybe @prideout has context here -- is reading floats-as-indices via cgltf_accessor_read_index used in Filament?

No. If we were to remove that particular case from the switch, I don't think we would adversely affect Filament.

As I recall, I included that case for completeness, since this function was originally conceived as a general purpose conversion utility, which is consistent with a strict interpretation of its docblock at the top of the file.

Note: I am no longer affiliated with Filament or Google.

zeux commented 9 months ago

Great, thanks! I submitted https://github.com/jkuhlmann/cgltf/pull/232 to remove these to avoid future confusion and ensure that unpack_indices never encounters valid scenarios where it could decode float -> int.

jkuhlmann / cgltf

Unpack indices of arbitrary component types #227