BinomialLLC / basis_universal

Basis Universal GPU Texture Codec
Apache License 2.0

Severe transcoder performance issues #250

Closed kg closed 3 years ago

kg commented 3 years ago

For some reason the transcoder is incredibly slow when I host it in my own process. I've tried various things and it's 10-100x slower than asking the command-line encoder build to convert the .basis file back to a .png after transcoding it.

I've tried various build settings and it doesn't seem to make much of a difference, and pointing a profiler at it didn't show anything out of the ordinary beyond the transcoder just spending an enormous amount of time transcoding.

While the texture is big, the basis file is pretty small (~400k), and I still notice overhead for small images. Enabling/disabling mipmaps doesn't seem to have much of an impact. I've tried using both Dxt1 and Dxt3 as output formats. (I'd try Dxt5, but it's not easily available to me right now, and it didn't seem to be much faster when I tried just passing that output format once and not uploading the result to the GPU.)

Incidentally, I also get random failures to transcode an image level that don't recur if I immediately retry, which seems like it could be related. It's hard for me to come up with an explanation for that one, but I'll file a separate issue if I come up with more detail. I just wanted to call it out in case it makes you go 'oh, your problem is X'.

Compiler-wise I'm using VS2017's 64-bit compiler and have tried /O2 and /Ox, with and without pdb generation enabled, with and without link time code generation, and I'm static-linking the runtime library.

Looking at the generated code it seems like maybe it's just... bad? Is basis really only supported for clang and bleeding edge VC++?


kg commented 3 years ago

Hadn't looked at the profiler again recently; this does seem to suggest VC++ is the problem.

kg commented 3 years ago

Wouldn't you know it, after intermittently messing with this all day, it seems like the problem was just the STL not getting optimized. I'm still not sure why it was happening, but after messing with a bunch of additional MSVC and build system flags, it seems like that went away. So it's just the basis code running now, and it seems to be fine (albeit still 5-10x slower than basisu.exe, which I will blame on VS2017).
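For anyone wanting to put numbers on a comparison like this, here is a minimal timing sketch. It assumes nothing about the basis API: `mean_ms` and `sum_bytes` are hypothetical names made up for illustration, and `sum_bytes` is only a stand-in workload that leans on `std::vector::operator[]`, the kind of call that tends to get expensive when the MSVC STL is built with its debug checks on.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of basis_universal): runs `work`
// `iterations` times and returns the mean wall-clock time per run in ms.
template <typename F>
double mean_ms(F&& work, int iterations) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / iterations;
}

// Stand-in workload, NOT transcoder code: sums a byte buffer through
// std::vector::operator[] so that STL call overhead shows up in the timing.
unsigned long long sum_bytes(const std::vector<unsigned char>& data) {
    unsigned long long total = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        total += data[i];
    }
    return total;
}
```

To use it, wrap the call being measured in a lambda, for example `mean_ms([&] { sum_bytes(buffer); }, 100)`, and run the same binary under each set of build flags. If the number changes drastically between configurations, the overhead is more likely in the build settings than in the code being timed.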


jiangzhhhh commented 2 years ago

On the VC platform, the STL runs slowly in debug mode.

see: https://docs.microsoft.com/en-us/cpp/standard-library/checked-iterators?redirectedfrom=MSDN&view=msvc-170
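One quick way to check whether this applies to a given build is to report the relevant macros from inside the process. The sketch below is an illustration only: `stl_debug_config` is a hypothetical helper name, while `_ITERATOR_DEBUG_LEVEL` and `_DEBUG` are MSVC-specific macros that are normally undefined on other compilers. As far as I know, MSVC defaults `_ITERATOR_DEBUG_LEVEL` to 2 when `_DEBUG` is defined (which the debug runtime options `/MTd` and `/MDd` do) and to 0 otherwise.

```cpp
#include <string>

// Hypothetical helper: describes how the standard library was configured
// for this translation unit. _ITERATOR_DEBUG_LEVEL and _DEBUG are
// MSVC-specific; other compilers will usually report them as undefined.
std::string stl_debug_config() {
    std::string out;
#if defined(_ITERATOR_DEBUG_LEVEL)
    out += "_ITERATOR_DEBUG_LEVEL=" + std::to_string(_ITERATOR_DEBUG_LEVEL);
#else
    out += "_ITERATOR_DEBUG_LEVEL=undefined";
#endif
#if defined(_DEBUG)
    out += " _DEBUG=defined";
#else
    out += " _DEBUG=undefined";
#endif
#if defined(NDEBUG)
    out += " NDEBUG=defined";
#else
    out += " NDEBUG=undefined";
#endif
    return out;
}
```

Printing the result of `stl_debug_config()` at startup in the host process shows what the build actually used. A nonzero `_ITERATOR_DEBUG_LEVEL` in a build that is supposed to be optimized would suggest that a debug runtime or a stray `_DEBUG` define is being picked up somewhere in the build system.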