guillaumeblanc / ozz-animation

Open source c++ skeletal animation library and toolset
http://guillaumeblanc.github.io/ozz-animation/

Decoupling SamplingJob from file format #163

Closed sherief closed 7 months ago

sherief commented 1 year ago

I'm using ozz for a runtime and I'm running into a performance issue. With SkinningJob, the data provided is a bunch of spans + strides, meaning the file format I use is irrelevant. I started out with the sample format, but perf analysis showed that deserialization times were high. That wasn't a problem: I packed my data in my own format and was able to just memcpy() it from a trusted source, and even decompress-on-read on current-gen consoles. The difference versus the sample mesh format was ~1000x.
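To illustrate the kind of memcpy-style load described above, here is a minimal sketch. It is not the code used in my title and not an ozz API: `PackedVertex`, `BlobHeader` and `LoadPackedMesh` are hypothetical names, and the sketch assumes the blob comes from a trusted source and was written with the same struct layout and endianness as the reader.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical packed vertex layout (assumption, not an ozz type).
struct PackedVertex {
  float position[3];
  float normal[3];
  std::uint16_t joint_indices[4];
  float joint_weights[3];  // 4th weight is implied by the other three
};
static_assert(std::is_trivially_copyable<PackedVertex>::value,
              "memcpy loading requires a trivially copyable layout");

// Hypothetical blob header (assumption).
struct BlobHeader {
  std::uint32_t vertex_count;
};

// Loads a mesh from a trusted in-memory blob using two memcpy calls:
// one for the header, one for the whole vertex array. The only check
// performed is that the blob is large enough.
inline bool LoadPackedMesh(const std::uint8_t* blob, std::size_t size,
                           std::vector<PackedVertex>* out) {
  assert(blob != nullptr && out != nullptr);
  BlobHeader header;
  if (size < sizeof(header)) return false;
  std::memcpy(&header, blob, sizeof(header));

  const std::size_t bytes =
      static_cast<std::size_t>(header.vertex_count) * sizeof(PackedVertex);
  if (size - sizeof(header) < bytes) return false;

  out->resize(header.vertex_count);
  if (bytes != 0) std::memcpy(out->data(), blob + sizeof(header), bytes);
  return true;
}
```

The point of the layout is that the reader never parses individual fields, so the cost is dominated by the copy itself.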

Now I'm looking at the animation format and seeing the same issues. There is nothing wrong with the format for samples, as error checking and versioning are important, but for runtime use, and especially on consoles when loading from a trusted source, the load times for animations are very high. Some cutscene animations call the stream read function 10,000 to 80,000 times for one animation, and all of these animations are 5 seconds or less. I'd like to do the same thing I did for the mesh format and lay out the data in a more compression- and read-friendly way, but the sampling job and its context are tied to the Animation class, which can either be deserialized (slowly) or built using AnimationBuilder. Neither option meets the perf requirements for runtime. Right now the title's loading times are dominated (>70%) by CPU time spent loading ozz animations.

Would you be open to making the sampling more runtime-perf friendly? Ideally, I'd like for the SamplingJob to receive its data as a set of spans like SkinningJob, and would like the SamplingJob::Context to allow me to provide it with data buffers to use for memory storage of current / outdated soa_transforms. The latter is so I can use a pre-allocated pool and avoid calling the allocator entirely, which matters a lot in many-core scenarios. I'd be open to contributing to this, of course.
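A rough sketch of what I mean by a context that works from caller-provided memory. This is purely illustrative: `Span`, `ExternalBufferContext` and `Carve` are hypothetical names, not the existing ozz API, and the sketch only shows the idea of handing out aligned sub-ranges of a pre-allocated buffer without touching the allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical minimal span (assumption): a pointer plus an element count.
template <typename T>
struct Span {
  T* data;
  std::size_t size;
};

// Hypothetical context that never allocates: it carves typed, aligned
// sub-ranges out of a buffer owned by the caller (e.g. a pre-allocated pool).
class ExternalBufferContext {
 public:
  ExternalBufferContext(void* buffer, std::size_t bytes)
      : begin_(static_cast<std::uint8_t*>(buffer)),
        cursor_(begin_),
        end_(begin_ + bytes) {
    assert(buffer != nullptr);
  }

  // Returns `count` elements of T from the buffer, or an empty span if the
  // remaining space is too small. Relies on T being a trivial type.
  template <typename T>
  Span<T> Carve(std::size_t count) {
    static_assert(std::is_trivial<T>::value, "sketch supports trivial types only");
    const std::uintptr_t mask = alignof(T) - 1;
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(cursor_);
    const std::size_t padding =
        static_cast<std::size_t>(((addr + mask) & ~mask) - addr);
    const std::size_t available = static_cast<std::size_t>(end_ - cursor_);
    if (padding > available || count > (available - padding) / sizeof(T)) {
      return Span<T>{nullptr, 0};
    }
    T* result = reinterpret_cast<T*>(cursor_ + padding);
    cursor_ += padding + count * sizeof(T);
    return Span<T>{result, count};
  }

  // Makes the whole buffer available again; previously carved spans must no
  // longer be used after this call.
  void Reset() { cursor_ = begin_; }

 private:
  std::uint8_t* begin_;
  std::uint8_t* cursor_;
  std::uint8_t* end_;
};
```

With something along these lines, each worker thread could own one fixed buffer and reuse it for every sampling call.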

guillaumeblanc commented 7 months ago

Hi,

Thanks for reporting the issue. Very interesting!

With SkinningJob, the data provided is a bunch of spans + strides, meaning the file format I use is irrelevant.

Indeed, that's the purpose of using spans for the skinning job. ozz doesn't fix the mesh format, as this is outside the library's scope and difficult to make engine-agnostic.

Some cutscene animations call the stream read function 10,000 to 80,000 times for one animation

I was not expecting such an impact on animation reads. The changes to the animation format made for release 0.15.0 refactor serialization to use arrays, which significantly reduces the number of reads. Loading an animation with this version takes tens of µs on an old laptop. Could you give it a try?
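To show why array-based serialization reduces the read count, here is a small sketch. It does not use ozz::io::Stream or the actual ozz archive code: `CountingStream`, `ReadPerElement` and `ReadAsArray` are hypothetical helpers that only count how many read calls each strategy issues.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical in-memory stream (assumption) that counts Read() calls.
class CountingStream {
 public:
  CountingStream(const std::uint8_t* data, std::size_t size)
      : data_(data), size_(size), pos_(0), read_calls_(0) {}

  // Copies up to `bytes` bytes into dst and returns the number copied.
  std::size_t Read(void* dst, std::size_t bytes) {
    assert(dst != nullptr);
    ++read_calls_;
    const std::size_t n = std::min(bytes, size_ - pos_);
    if (n != 0) std::memcpy(dst, data_ + pos_, n);
    pos_ += n;
    return n;
  }

  std::size_t read_calls() const { return read_calls_; }

 private:
  const std::uint8_t* data_;
  std::size_t size_;
  std::size_t pos_;
  std::size_t read_calls_;
};

// One stream read per value: the call count grows with the key count.
inline bool ReadPerElement(CountingStream& stream, float* out, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    if (stream.Read(&out[i], sizeof(float)) != sizeof(float)) return false;
  }
  return true;
}

// One stream read for the whole array, regardless of the key count.
inline bool ReadAsArray(CountingStream& stream, float* out, std::size_t count) {
  const std::size_t bytes = count * sizeof(float);
  return stream.Read(out, bytes) == bytes;
}
```

Both functions produce the same data; only the number of calls into the stream differs, which matters when each call has a fixed overhead.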

Also, just to be sure: are you using ozz::io::Stream provided by ozz, or have you implemented yours?

I'd like for the SamplingJob to receive its data as a set of spans like SkinningJob

That's not possible: constructing a runtime animation is complex, as you can see from the animation builder implementation, and is not part of the public API. From the user's perspective, the public API is that ozz builds a valid animation from a valid raw animation (which is easy to build).

and would like the SamplingJob::Context to allow me to provide it with data buffers

That's a good idea. Don't hesitate to provide a PR for this. Also, have you considered recycling contexts?
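By recycling contexts I mean something like the sketch below: allocate a fixed set of contexts once, then hand them out and take them back instead of creating new ones. `SamplingContextStub` and `ContextPool` are hypothetical stand-ins for illustration, not ozz types, and the pool is not thread-safe as written.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a sampling context (assumption, not the ozz type).
struct SamplingContextStub {
  explicit SamplingContextStub(std::size_t max_tracks) : scratch(max_tracks) {}
  std::vector<float> scratch;  // allocated once, reused across samplings
};

// Fixed-size pool: every allocation happens in the constructor.
class ContextPool {
 public:
  ContextPool(std::size_t count, std::size_t max_tracks) {
    contexts_.reserve(count);
    for (std::size_t i = 0; i < count; ++i) contexts_.emplace_back(max_tracks);
    free_.reserve(count);
    for (SamplingContextStub& context : contexts_) free_.push_back(&context);
  }
  ContextPool(const ContextPool&) = delete;
  ContextPool& operator=(const ContextPool&) = delete;

  // Returns a free context, or nullptr if all of them are in use.
  SamplingContextStub* Acquire() {
    if (free_.empty()) return nullptr;
    SamplingContextStub* context = free_.back();
    free_.pop_back();
    return context;
  }

  // Returns a context to the pool. Capacity was reserved up front, so this
  // does not allocate.
  void Release(SamplingContextStub* context) {
    assert(context != nullptr);
    free_.push_back(context);
  }

 private:
  std::vector<SamplingContextStub> contexts_;
  std::vector<SamplingContextStub*> free_;
};
```

Sizing the pool to the maximum number of concurrently sampled animations keeps the allocator out of the per-frame path.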

Cheers, Guillaume

guillaumeblanc commented 7 months ago

I consider this issue fixed by the optimizations in version 0.15.0.

Don't hesitate to reopen if needed.

Cheers, Guillaume