GothicKit / ZenKit

A re-implementation of file formats used by the early 2000's ZenGin
http://zk.gothickit.dev/
MIT License

Performance concern regarding `zenkit::Polygon` #88

Closed: Try closed this issue 4 months ago

Try commented 7 months ago

I've noticed a load-time performance regression after merging the new version of phoenix (zenkit). While I haven't done any real profiling yet, pausing the application in the debugger almost always shows this call stack: [image] For now, I'm just raising awareness.

And while at it, can you remind me why zenkit::Polygon is a thing? I'm asking because I'm currently looking into more efficient mesh packing, and one promising direction is an instantiated list of 62 triangle strips. Basically, I'm looking into what the options are.

lmichaelis commented 7 months ago

Hio, I added the Polygon for compatibility with the BSP tree. Projects like GothicVR want to use it for performance improvements of their own. Yes, I know it's a bit slow with all the allocations happening, so I do mean to write my own allocator at some point to speed it up (or just set an upper limit on vertices per polygon).
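The "upper limit on vertices per polygon" option could look roughly like the sketch below. It is only an illustration: `InlinePolygon` and `kMaxPolygonVertices` are hypothetical names, and the cap of 8 is an assumed value, not a limit taken from ZenKit or the ZenGin formats.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical cap; the real maximum for ZenGin polygons is not stated in this thread.
constexpr std::size_t kMaxPolygonVertices = 8;

// Hypothetical stand-in for a polygon that stores its vertex indices inline,
// so constructing one never touches the heap.
struct InlinePolygon {
    std::array<std::uint32_t, kMaxPolygonVertices> indices {};
    std::uint8_t count = 0;

    // Returns false instead of growing once the cap is reached.
    bool push(std::uint32_t index) {
        if (count == kMaxPolygonVertices) return false;
        indices[count++] = index;
        return true;
    }

    // Bounds-checked (in debug builds) access to a stored vertex index.
    std::uint32_t at(std::size_t i) const {
        assert(i < count);
        return indices[i];
    }

    std::size_t size() const { return count; }
};
```

The trade-off is wasted space for small polygons and a hard failure path for anything above the cap, in exchange for zero per-polygon allocations.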

Try commented 7 months ago

> write my own allocator

An alternative might be to put features/vertices in a contiguous array of data and reference it with offsets in Polygon or with std::span.
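A minimal sketch of that idea, assuming polygons only need a list of vertex indices. `PolygonRef` and `PolygonPool` are hypothetical names, not ZenKit types. The sketch uses plain offsets so that it compiles as C++17; with C++20, the `data`/`size` pair could be returned as a single `std::span<const std::uint32_t>` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical polygon record: no owned storage, only a range into the shared pool.
struct PolygonRef {
    std::uint32_t first; // offset of the first vertex index in the pool
    std::uint32_t count; // number of vertex indices belonging to this polygon
};

// Hypothetical container; not ZenKit's actual layout.
struct PolygonPool {
    std::vector<std::uint32_t> indices; // all polygons' vertex indices, back to back
    std::vector<PolygonRef> polygons;   // one small record per polygon

    // Appends one polygon and returns its position in `polygons`.
    // The only allocations are the amortized growth of the two vectors.
    std::size_t add(const std::uint32_t* data, std::uint32_t count) {
        PolygonRef ref {static_cast<std::uint32_t>(indices.size()), count};
        indices.insert(indices.end(), data, data + count);
        polygons.push_back(ref);
        return polygons.size() - 1;
    }

    // Pointer to the first vertex index of polygon `i`; valid until the next add().
    const std::uint32_t* data(std::size_t i) const {
        assert(i < polygons.size());
        return indices.data() + polygons[i].first;
    }

    // Number of vertex indices in polygon `i`.
    std::uint32_t size(std::size_t i) const {
        assert(i < polygons.size());
        return polygons[i].count;
    }
};
```

Storing offsets rather than pointers keeps the records valid when the pool reallocates, and calling `reserve` on both vectors up front (when the counts are known from the file header) would remove the remaining reallocations.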

lmichaelis commented 4 months ago

@Try I've improved performance by ~25% on my machine. Please try out the patch and let me know if this is to your liking :)

Try commented 4 months ago

Testing: the time that World::World(...) takes

Baseline:

// debug
newworld.zen load time: 47.480000
newworld.zen load time: 47.005001
newworld.zen load time: 47.536999

oldworld.zen load time: 11.428000
oldworld.zen load time: 13.113000
oldworld.zen load time: 12.029000

// rel-debug
newworld.zen load time: 52.644001
newworld.zen load time: 52.631001
newworld.zen load time: 52.368999

oldworld.zen load time: 12.678000
oldworld.zen load time: 12.822000
oldworld.zen load time: 12.899000

With new changes:

// debug
newworld.zen load time: 12.190000
newworld.zen load time: 12.774000
newworld.zen load time: 16.729000

oldworld.zen load time: 4.434000
oldworld.zen load time: 4.440000
oldworld.zen load time: 4.618000

// rel-debug
newworld.zen load time: 16.590000
newworld.zen load time: 13.988000
newworld.zen load time: 14.205000

oldworld.zen load time: 7.482000
oldworld.zen load time: 7.746000
oldworld.zen load time: 7.606000

Aside from MinGW curiosities, where the release build runs slower than debug, I can say that it's a pretty substantial improvement. More than 3x - great :)
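For reference, the ratio can be checked directly from the timings above. The sketch below averages the three runs of each configuration and divides baseline by new; the helper names (`mean3`, `speedup`, `print_speedups`) are made up for this illustration. With these numbers the mean speedup comes out at roughly 3.4x (debug) and 3.5x (rel-debug) for newworld.zen, and roughly 2.7x (debug) and 1.7x (rel-debug) for oldworld.zen, so the "more than 3x" figure applies to newworld.zen.

```cpp
#include <array>
#include <cassert>
#include <cstdio>

using Samples = std::array<double, 3>;

// Mean of three timing samples.
double mean3(const Samples& s) {
    return (s[0] + s[1] + s[2]) / 3.0;
}

// Speedup = mean baseline time / mean time with the new changes.
double speedup(const Samples& baseline, const Samples& updated) {
    const double after = mean3(updated);
    assert(after > 0.0);
    return mean3(baseline) / after;
}

// Prints the four ratios; all timings are copied from the measurements in this thread.
void print_speedups() {
    std::printf("newworld debug:     %.2fx\n",
                speedup({47.480000, 47.005001, 47.536999}, {12.190000, 12.774000, 16.729000}));
    std::printf("newworld rel-debug: %.2fx\n",
                speedup({52.644001, 52.631001, 52.368999}, {16.590000, 13.988000, 14.205000}));
    std::printf("oldworld debug:     %.2fx\n",
                speedup({11.428000, 13.113000, 12.029000}, {4.434000, 4.440000, 4.618000}));
    std::printf("oldworld rel-debug: %.2fx\n",
                speedup({12.678000, 12.822000, 12.899000}, {7.482000, 7.746000, 7.606000}));
}
```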

lmichaelis commented 4 months ago

Out of curiosity: what compiler version are you using? It seems to produce very low performance code :/

Try commented 4 months ago

MinGW 11.2.0 - the one that ships by default with QtCreator on Windows. The code is probably not important in this case. Most likely, the C++ runtime that comes with MinGW has a non-optimized allocator for Windows.