Twinklebear / tobj

Tiny OBJ Loader in Rust
MIT License
233 stars 47 forks source link

Do not duplicate vertices/mirror structure of original OBJ #34

Closed virtualritz closed 3 years ago

virtualritz commented 3 years ago

tobj blows up the number of vertices when an imported model has per-vertex-per-face normals. There may be other cases (I e.g. would assume the same happens with per-vertex-per-face UVs/vertex colors).

In offline renderers like 3Delight, Arnold or RenderMan, duplicated vertices make the geometry discontinuous and a mesh will crack wide open at face boundaries when displacement shaders are attached to.

A mesh tagged as a render-time subdivision surface won't even render correct if the vertices are not shared between faces in the index.

o Tetrahedron
v  1  1  1
v  1 -1 -1
v -1  1 -1
v -1 -1  1
vn  0.577  0.577 -0.577
vn -0.577  0.577  0.577
vn  0.577 -0.577  0.577
vn  0.577  0.577 -0.577
vn  0.577 -0.577  0.577
vn -0.577 -0.577 -0.577
vn  0.577  0.577 -0.577
vn -0.577  0.577  0.577
vn -0.577 -0.577 -0.577
vn -0.577  0.577  0.577
vn  0.577 -0.577  0.577
vn -0.577 -0.577 -0.577
s 1
f 1//3 4//11 2//5
s 2
f 2//4 3//7 1//1
s 3
f 2//6 4//12 3//9
s 4
f 3//8 4//10 1//2

For this example, I expect the resulting Mesh's positions to have 12 f32 entries (aka 4 vertices). The normals should have 36 entries (aka 12 normals).

However, to not require a separate index for the normals, the normals (or texture coordinates/vertex colors) need to be reordered, if they are per-vertex-per-face. I.e. their index should just be monotonically increasing.

In the case above, the index in the file needs to be re-ordered, on import so that it would it look like this:

s 1
f 1//1 4//2 2//3
s 2
f 2//4 3//5 1//6
s 3
f 2//7 4//8 3//9
s 4
f 3//10 4//11 1//12
virtualritz commented 3 years ago

I forgot: I do understand that the current behavior is convenient for users targeting real time display of the geometry. So this should probably be an option.

Twinklebear commented 3 years ago

This seems like kind of the route that tinyobjloader has gone now https://github.com/tinyobjloader/tinyobjloader#data-format , returning separate indices for position, normal and texcoord instead of duplicating things to make unique vertices with a single index. Adding this mode as another import mode would be nice to support both use cases, and adding the single-index vertex creation on top of the loaded data with 3 indices isn't bad to do, that's what my C++ projects using tinyobjloader do now.

One thing that's not clear to me is how the reordered mesh lets you avoid having a separate index for the normals:

s 1
f 1//1 4//2 2//3
s 2
f 2//4 3//5 1//6
s 3
f 2//7 4//8 3//9
s 4
f 3//10 4//11 1//12

I can see how reordering the vn entries can let you make the indices monotonically increasing here, but the indexing is still different from the position index, so it seems like it would still need separate position and normal indices if you didn't want to duplicate positions?

virtualritz commented 3 years ago

The format you send this data to the offline renderer is usually an array that has the number of vertices for each face.

3, 4, 3, 5, 5, ...

A triangle, a quad, a triangle and two 5-gons. And then, for each attribute (position is not special but obviously mandatory), the actual data and a separate index.

So an attribute that is not sparse (as is the normal above, there are twelve normals for twelve per face vertices) could have an index like this (square brackets indicate faces):

[ 0 1 2 ]  [ 3 4 5 6 ]  [ 7 8 9 ]  [ 10 11 12 13 14 ]  [ 15 16 17 18 19 ]

And in this case only the index can be omitted.

Ofc, if the attribute is sparse, you can't do this or you need to blow up data again. But this is also interesting because there may be a tradeoff.

Imagine the original example would have only one normal shared by the three adjacent faces. So it would have 10 entries since in a tetrahedron all vertices have valence three (one face with three normals, three faces with two normals each plus one shared normal by those three faces).

10 3 32 = 960 bytes of f32 normal data + 12 * 32 = 344 bytes of i32 index data = 1344 bytes for the normal attribute.

So if we blew up the normal array to be non sparse and reordered it, as above:

12 3 32 = 1152 bytes of f32 data in total for the normal attribute. I.e. less than the sparse one + index.

I.e. it depends on 'how sparse' data is whether blowing it up may actually be preferred. I would think that in most cases it wouldn't but it's just a gut feeling.

It would ofc. be super cool if there was a mode of operation where the lib could detect that and offer the most compact representation automatically.

virtualritz commented 3 years ago

Any feedback on this/ETA? I need this soon.

I may have a stab at it myself but I will probably need some hand holding, where/how to start.

Twinklebear commented 3 years ago

Sorry no ETA, this slipped off my todo list a bit. Your explanation makes sense though so I see what you mean now in reordering/expanding the data to remove the need for indices to reference the normals.

If you have time/interest that'd be great, and I can definitely give some pointers and notes as to where to get started. I have some follow up questions though based on that and the original note in the issue about if the positions don't reference the same position index the mesh will come apart:

virtualritz commented 3 years ago

Oh, that was quick. So I am already working on this. I should hopefully have a naive implementation in an hour or so. What I did was:

pub struct Mesh {
    pub positions: Vec<f32>,
    pub normals: Vec<f32>,
    pub texcoords: Vec<f32>,

    pub indices: Vec<u32>,
    pub normal_indices: Option<Vec<u32>>,
    pub texcoord_indices: Option<Vec<u32>>,

    pub num_face_indices: Vec<u32>,

    pub material_id: Option<usize>,
}

Once I have this working I get back to you and eventually look into reordering (although I do not need this, at the beginning).

virtualritz commented 3 years ago

Have a look at my multi_index branch.

What I did is to add a new code path for that. It is selected by create_multi_index – another boolean passed to the loader functions.

When the the normals or UVs is/are sparse, the previous last index value gets duplicated. This is kinda dumb. But I reckon sparse UV/normal specifications are rare enough in the wild to leave that as-is – until someone complains.

When the discard_trivial_indices feature is enabled (requires nightly), the normal and uv indices get checked for being trivial. I.e. if they are just monotonic 0...n with step 1. And if so, they get discarded (set to None). That is the 'reordering' I was talking about which kinda happens automagically in your own code already. All that is needed is to check the resp. indices, afterwards, with is_sorted_by() and then possibly discard.

I.e.: if the mesh has texture coordinates/normals and lacks the resp. indices then the indices were [0, 1, 2, ... n] with n = number of vertices (value per vertex) or sum(face arities) (value per vertex per face) and hence got discarded.

P.S: I also added a feature, ahash to use ahash::AHashMap/ahash::AHasSet over the std versions as it is supposed to be a lot faster. But not sure if this has any effect in a simple case as here. Needs benches.

virtualritz commented 3 years ago

I also updated the docs and the tests now. I am using this in a project with opensubdiv-petite at the moment and it seems to behave as expected. I will test a bit more and open a PR if that's ok with you.

Twinklebear commented 3 years ago

Sounds good! I haven't had a chance to go through your branch yet but once you've got things looking good I'm happy to take a look at a PR. What you mention in #36 (expanding lines to degenerate triangles when triangulate is on) also sounds good to include here if you've got that on the fork too

virtualritz commented 3 years ago

What you mention in #36 (expanding lines to degenerate triangles when triangulate is on) also sounds good to include here if you've got that on the fork too

Yes, that is on the fork. The load_obj() function has updated docs re. this. In general, have a look at the docs. I added some stuff and did a tad of polishing.

Twinklebear commented 3 years ago

I took a look through and it's looking good, feel free to open a PR when you're ready, then it'll be a bit easier to put some specific comments/review notes.

One question though after looking at it, is the multi index layout more like tinyobjloader's multi-index data format: https://github.com/tinyobjloader/tinyobjloader#data-format ? Where each vertex has a separate position/normal/texcoord index? At a short glance it seems a bit more like this + this expansion instead of a single indexed sort of thing like you were mentioning before?

One thing I was thinking, is that we could actually have the default load path be this kind of tinyobjloader/multi-indexed data layout, which is more a direct mapping of the obj file, and then have optional "post-load mesh processors" that would produce single indexed format (like what I have now), or this multi index format you're adding.

virtualritz commented 3 years ago

Where each vertex has a separate position/normal/texcoord index?

I had since changed the code so the indices are not Options any more. Instead of checking for None one would check for is_empty(). I found the Option generating unnecessary extra code. I can give some concrete example, if it's not obvious.

Each index is a separate array. This suits how one would typically use that data downstream. E.g. OpenSubdiv or the aforementioned offline renderers. Specifically, you can use the bytemuck crate to cast indices from u32 to i32 for my nsi crate. And opensubdiv-petite will eat it u32 indices as-are.

It looks like tinyobjloader uses a single index where each entry then has the values for the resp. attributes. This is cumbersome for the use cases I am aware of and also wastes space if an attribute is absent – unless I miss something.

In my current implementation they are just empty if the data is absent or not needed (face_arities in case of the triangulate_faces flag set so far – there is no triangle only detection which could be added).

One thing I was thinking, is that we could actually have the default load path be this kind of tinyobjloader/multi-indexed data layout, which is more a direct mapping of the obj file, and then have optional "post-load mesh processors" that would produce single indexed format (like what I have now), or this multi index format you're adding.

I would think most users of tobj are from the realtime crowd for now. So while it is more true for now it may add computations to their use-cases that the current API nicely avoids – by just adding two bool flags.

Post processing is nice though. I e.g. have code that takes the normal data and generates creases & corners from that for OpenSubdiv. But that requires another two indices.

Nice-to-have post processors are:

While one could argue this is the job of other crates I find that I kinda expect this from any mesh loader. Particularly the last feature.

virtualritz commented 3 years ago

I added reordering support in my fork. Needs testing.

That branch also has the flags refactored into a LoadOptions struct that uses the init struct pattern for convenience.

virtualritz commented 3 years ago

Do you think we can get any of this merged in soonish? The reason is that another rust gamedev newsletter is coming up.

I am writing a brief section on my opensubdiv-petite wrapper. One thing meshes need, to be useful as input for this lib, is to have connected topology.

Which makes this issue and #41 relevant additions to tobj for people that load OBJs to be used with that crate. And I would like to mention this. 😉

Other crates, e.g. stl_io have this position merging support also built in btw.

virtualritz commented 3 years ago

In my current implementation they are just empty if the data is absent or not needed (face_arities in case of the triangulate_faces flag set so far – there is no triangle only detection which could be added).

I added triangle mesh detection in my cleanup branch. Basically, face_arties.is_empty() == true guarantees the mesh has only triangles.

This branch also has two options to filter out points and lines completely. Again, this is informed by requirements of using the meshes with e.g. opensubdiv-petite or if one wants to do some topological operations on them (polyhedron-ops).

Twinklebear commented 3 years ago

Yeah as far as merging, I've looked through the fork some and I think this is looking pretty nice. The external API changes are not big but adds a lot of nice features, and the expanded documentation is great.

If you've got it ready (I saw you closed the other PR), you can open a PR to just bring it all in and I can make some time to take a look.

Twinklebear commented 3 years ago

This is part of the PR and now in 3.0.0 correct? So we can close this