donmccurdy / glTF-Transform

glTF 2.0 SDK for JavaScript and TypeScript, on Web and Node.js.
https://gltf-transform.dev
MIT License

Support compressing KTX2 textures without CLI #675

Open hybridherbst opened 1 year ago

hybridherbst commented 1 year ago

Is your feature request related to a problem? Please describe.

I'm in the process of updating from 2.1.7 to 2.4.2 (due to the toktx command changes), but I'm running into what looks like "incorrect but somehow working" use of the library.

In 2.1.7, we were using something like:

import { NodeIO, PropertyType } from '@gltf-transform/core';
import { draco, toktx, Mode } from '@gltf-transform/cli';
// ...
await document.transform(
  // see https://github.com/donmccurdy/glTF-Transform/blob/main/packages/cli/src/transforms/toktx.ts
  toktx({ mode: Mode.ETC1S }),
  draco(),
);

It looks like in 2.4.2 draco has moved into /functions, but otherwise this still works.

How can we keep the simplicity of the toktx command without depending on the CLI package? I might be missing something obvious here!

donmccurdy commented 1 year ago

I've been hoping that @squoosh/lib might add Basis Universal support, which would let me move KTX2 compression out of the CLI environment, support in-browser compression, and stop relying on KTX-Software. As long as the implementation depends on the toktx CLI command, it is probably best kept in the /cli package; I'd rather not have the /functions package spawning shell processes. Unfortunately, progress in squoosh appears to have stalled (https://github.com/GoogleChromeLabs/squoosh/pull/1017).

There's also https://github.com/BinomialLLC/basis_universal; they've recently added a WASM encoder for KTX2. I haven't looked at it closely enough yet to say whether it works on the Web, in Node.js, or both, or whether it's generally a good replacement.

I think some more investigation would be needed here. In the meantime, it would unfortunately be necessary to copy/paste the toktx() function to avoid the /cli dependency; even then, it requires access to a shell and the KTX-Software toktx CLI program.
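For anyone going the copy/paste route, the shell invocation itself is small. A hedged sketch follows: the flag names (--t2, --encode, --qlevel, --clevel) come from KTX-Software's toktx and should be verified against `toktx --help` for your installed version, and `runToktx` assumes the binary is on the PATH.

```javascript
// Sketch: building toktx arguments and invoking the CLI directly from
// Node.js, without the /cli package. Flag names should be verified
// against `toktx --help`; paths and defaults here are illustrative only.
function buildToktxArgs({ mode = 'etc1s', quality = 128, compression = 1 }, inPath, outPath) {
  return [
    '--t2',                          // emit KTX2 rather than KTX1
    '--encode', mode,                // 'etc1s' or 'uastc'
    '--qlevel', String(quality),
    '--clevel', String(compression),
    outPath,                         // toktx takes the output file first...
    inPath,                          // ...then the input file(s)
  ];
}

async function runToktx(options, inPath, outPath) {
  // Requires a shell environment with KTX-Software's toktx on the PATH.
  const { spawnSync } = await import('node:child_process');
  const result = spawnSync('toktx', buildToktxArgs(options, inPath, outPath), { encoding: 'utf8' });
  if (result.status !== 0) {
    throw new Error(`toktx failed: ${result.stderr || result.error}`);
  }
}
```

Note that toktx works on files, so image data from a Document still has to be written to temp files before encoding and read back afterward.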

hybridherbst commented 1 year ago

Makes sense! I wasn't aware of this chain of issues, thanks for clarifying.

(Side note: is there a better place to read about breaking changes than the Changelog? I just ran into classes extending ExtensionProperty now requiring an init() implementation (a change somewhere between 2.1.7 and 2.4.2, not sure when), but I haven't seen that mentioned anywhere.)

donmccurdy commented 1 year ago

I believe the only relevant changes here were...

... included in v2.0.0 and mentioned in the changelog for that release ... is it possible you had something pinned older than v2.1.7? I'm not sure how a 2.1.7 to 2.4.2 update would break an extension. Sorry for the trouble if so! 😕

donmccurdy commented 1 year ago

@hybridherbst were you able to get the v2.4.2 update working?

hybridherbst commented 1 year ago

Yep, we were. We also ended up doing what you recommended and moving the toktx scripts out of the CLI package into our own, to get rid of the CLI dependency.

Samsy commented 1 year ago

@hybridherbst is this available somewhere?

Samsy commented 1 year ago

Oops, sorry, I misunderstood. I thought there was a WebAssembly (WASM) implementation of the KTX encoder that had been hacked together.

gabrieljbaker commented 1 year ago

@hybridherbst, we are in the same pickle you were in, and glad we stumbled on this issue. Were you able to cleanly copy/paste the code out of the CLI dependency? Curious whether you ran into any stumbling blocks there. We are looking at it but not entirely sure how to approach this.

donmccurdy commented 1 year ago

The Binomial KTX2 WASM encoder can be found here:

They also provide example usage:

Open questions remain:

donmccurdy commented 1 year ago

Additional context: KTX-Software includes a "libktx" WASM build, but that build does not include encoders. It could likely be extended to include the Basis Universal encoders, but that route sounds more difficult (for me, as a JS developer) than using the Binomial encoders.

gz65555 commented 10 months ago

I have tested the Binomial encoder, and it works fine in the browser environment. However, it throws an error in Node.js, and I'm not sure why.

donmccurdy commented 10 months ago

It appears that libKTX builds with encoders do exist, or that something along those lines is possible, as seen in the glTF Compressor project: https://github.com/KhronosGroup/glTF-Compressor

However, it doesn't yet support mipmaps (https://github.com/KhronosGroup/KTX-Software/issues/464). That would be a blocker for adding libKTX to glTF Transform itself, but it should still be fairly straightforward for users of glTF Transform to loop over textures and apply whichever encoder they prefer in their own scripts.
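That per-texture loop can be sketched as follows. This assumes the v2.x glTF-Transform API (listTextures(), getImage(), setImage(), setMimeType()); `encodeKtx2` is a placeholder for whichever encoder you choose, and the extension class name has varied across releases, so check your installed version.

```javascript
// Sketch: apply a KTX2 encoder of your choice to each texture in a
// Document. `encodeKtx2` is a placeholder (Binomial WASM, a toktx
// wrapper, ...); gltf-transform calls follow the v2.x API.

function shouldEncode(texture) {
  // Only re-encode uncompressed source formats.
  const mime = texture.getMimeType();
  return mime === 'image/png' || mime === 'image/jpeg';
}

async function compressTextures(document, encodeKtx2) {
  // Mark the KHR_texture_basisu extension as required on the document.
  const { KHRTextureBasisu } = await import('@gltf-transform/extensions');
  document.createExtension(KHRTextureBasisu).setRequired(true);

  for (const texture of document.getRoot().listTextures()) {
    if (!shouldEncode(texture)) continue;
    const ktx2 = await encodeKtx2(texture.getImage()); // Uint8Array in, Uint8Array out
    texture.setImage(ktx2).setMimeType('image/ktx2');
  }
}
```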

javagl commented 10 months ago

I recently played a bit with KTX compression, and tried out the BinomialLLC encoder. Regarding the open questions:

Does the encoder work in both Node.js and Web?

It appears to work on the Web, meaning that the encoder test can be run in a browser (with a small bugfix for the compilation), and it does work in Node.js.

How much memory should be preallocated for the KTX2 buffer?

I'm not sure what this refers to. There seems to be some pre-allocation taking place at the level where the WASM is generated, but it probably refers to the "output buffer" that the data is written to. A suspicious magic value is used in the test. I could imagine that a value of width*height*4*X (with X slightly larger than 1.0) should be fine, but... yes, it feels like guessing at this point.
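For what it's worth, that guess can be written down as a small helper; the slack factor here is purely an assumption, not a documented bound.

```javascript
// Rough upper-bound guess for a KTX2 output buffer, following the
// width*height*4*X idea above. The default slack of 1.25 is an
// assumption, not a documented bound; too small a buffer would
// presumably truncate or fail the encode.
function estimateKtx2BufferSize(width, height, slack = 1.25) {
  return Math.ceil(width * height * 4 * slack);
}
```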

Are necessary options still available?

I don't know the exact meaning of the options, but the options that are supported can be derived from the wrapper's definition.


I tried this out in the context of some experiments, where I locally created a glTF-Transform Transform that calls a utility class to compress the texture/image data in a glTF asset. This probably does not exactly match the API and behavior of glTF-Transform's current toktx call (for example, there is no consideration of mipmapping, among other things), but it shows that this should be possible in principle...

gz65555 commented 10 months ago

@javagl Hi, can you give me an example of the BinomialLLC encoder working in Node.js? I have tried it, but it throws an error in Node.js.

javagl commented 10 months ago

@gz65555 Stuff is throwing errors. People are fixing errors. New errors show up. Rinse and repeat. That's how it is 🤷‍♂️

However, this is an example, containing the encoder built from https://github.com/BinomialLLC/basis_universal/tree/ad9386a4a1cf2a248f7bbd45f543a7448db15267, with the fix from https://github.com/BinomialLLC/basis_universal/issues/356 , which just writes an "empty" KTX file:

BasisUniversalExample-2023-09-11.zip

To be of any use, something like this should be wrapped and hidden, and preferably offered as some NPM library. (I mean, who's really up for building something with CMake and Emscripten and fiddling with some WASM modules, when the goal is actually just to have some result = Magic.encodeMy(image) call!?). For now, this "convenience layer" only exists as a KtxUtility in another project. Maybe it will exist as a standalone package in the future. Maybe in glTF-Transform ...? ;-)
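To make that wished-for convenience layer concrete, a sketch might look like the following. Everything here is illustrative: `basisModule` stands in for an initialized WASM encoder module, and its `encode` method and these option names are placeholders, not a real Binomial API.

```javascript
// Sketch of a hypothetical `result = Magic.encodeMy(image)` layer.
// The injected `basisModule` hides the CMake/Emscripten/WASM details
// behind a single call; names here are placeholders.
function createKtxEncoder(basisModule) {
  return {
    async encode(image, options = {}) {
      const { mode = 'etc1s', quality = 128 } = options;
      // Delegate to the wrapped module; the caller never touches WASM.
      return basisModule.encode(image, { mode, quality });
    },
  };
}
```

Injecting the module also makes the layer testable with a stub, without building the real WASM.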

donmccurdy commented 4 months ago

I've learned that loaders.gl has implemented their KTX2 compressor using the BinomialLLC encoder as well. See source code. It looks like they have code paths defined for both Node.js and web... so that's promising as a future direction. Not sure yet about installing from loaders.gl, or using a similar approach.

javagl commented 4 months ago

It's a pity that there is no plain, small, dependency-free, standalone binomial-ktx package on npm that just does what it is supposed to do. Implementors of tools and exporters will have to re-implement the same mechanisms again and again, with slight differences in features and supported options, and everybody will have to deal with the pain of integrating the WASM.

A short summary (not necessarily complete; only what I have on the radar right now):

If glTF-Transform goes the path of doing a custom WASM/encoder integration (rather than using one of these as a dependency), it would at least be close to a "small, standalone package" (with some open questions about the actual API). But having a package that just offers that output = encode(input, options) function would still be nice...

donmccurdy commented 4 months ago

One more for the list:

Appears to be meant for use on web only. Based on the BinomialLLC encoder and ktx-parse. Depends on a CDN.

Samsy commented 4 months ago

Hello everyone

@donmccurdy

We got this working a while ago, but I wasn't following this thread anymore.

I've attached a glTF-Transform class that transforms textures to KTX in a web environment (with additional web worker support where available).

Look at usage.js for the import and initialisation.

I know this is not ideal, and not clean at all; it was quick work, but maybe it can help.

I can answer questions if needed

ktx.zip

javagl commented 4 months ago

@Samsy That file refers to the BinomialLLC encoders. Did you (also) encounter https://github.com/BinomialLLC/basis_universal/issues/356 when trying to encode to UASTC? (Just because it may be relevant in the broader sense, when considering the "robustness" of possible integrations...)

gz65555 commented 4 months ago

@donmccurdy I'm about to support Node.js soon. The browser version can also be modified to not rely on the CDN.

A lot of our internal projects already rely on ktx2-encoder and glTF-Transform for texture compression. If I have time, I can submit a pull request.

donmccurdy commented 4 months ago

I'd be interested in trying out Node.js support when that's available! I think we'll have to compare some of these and decide what to do. I'd love to have a WASM-based solution, but am a bit worried WASM may be much slower than the (multi-threaded) CLI.

javagl commented 4 months ago

The stuff in the ktx subdirectory of the 3D Tiles Tools is currently focussed on Node.js, but should also work on the web - after all, it's only a thin wrapper with typings for the actual Binomial Encoder. (And there actually already is a utility function that returns a glTF-Transform Transform for that).

but am a bit worried WASM may be much slower than the (multi-threaded) CLI.

That may be true. But you're aware of the trade-offs here:

Option 1:

Option 2:

Option 1 is preferable, and I could imagine that most people don't care whether encoding a model takes 1 minute or 5 minutes. (And whether it takes 1 minute or 10 minutes depends on the encoding options to begin with, like the qualityLevel and compressionLevel.)

Of course, for the time-critical cases (e.g. someone who wants to batch-process 1000 large models), the option to use the real executable should still exist. But I assume that you intended to offer this option anyhow.

Maybe it then boils down to

donmccurdy commented 4 months ago

I don't anticipate maintaining bindings in glTF Transform for more than one KTX2 encoder, unless they're API-compatible. KTX-Software and the BinomialLLC encoder do not have compatible APIs. If we add support for a WASM encoder, it will replace the KTX Software CLI integration.

If WASM is ~2x slower that's probably acceptable for the benefits; 5-10x slower would be a tough sell I think.

javagl commented 4 months ago

The degree of compatibility is... a judgement call, in many ways. toktx does have quality and compression parameters that are passed to the command-line call. The Binomial encoder has qualityLevel and compressionLevel parameters that (seem to) serve the same purpose.

There already are places in the API that allow some sort of "different bindings", so to speak, with options like textureCompress({ encoder: sharp, /* Node only */ ... }). And I would have thought that something similar could make sense here. Roughly: some KtxEncoder that receives the required parameters, with two implementations: one translates them into the command-line parameters for toktx, and the other translates them into the (sometimes merely renamed...) parameters of the Binomial encoder. (These parameters might be the "intersection" of the parameters supported by both encoders, related to the "Are necessary options still available?" point mentioned earlier.)
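The translation half of that idea is small enough to sketch: one shared option set, translated per backend. Parameter and flag names below are illustrative and would need checking against the real toktx CLI and Binomial wrapper.

```javascript
// Sketch of the option-translation half of a KtxEncoder abstraction.
// The shared options are the assumed "intersection" of what both
// backends support; names are illustrative.
function toToktxArgs({ quality, compression }) {
  // Command-line form for the toktx executable.
  return ['--qlevel', String(quality), '--clevel', String(compression)];
}

function toBinomialOptions({ quality, compression }) {
  // Same intent, merely renamed, for the WASM wrapper.
  return { qualityLevel: quality, compressionLevel: compression };
}
```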

But ... the forms of these encoders (executable vs. WASM) are so different, and require so specific infrastructures that the effort for maintaining both of them might not be justified.

About the performance... yeah. There is "The Truth", and there are "Benchmarks". But I was curious about that as well, and just did a very quick test (to be taken with the appropriate grain of salt):

The output:

Config quality_1_compression_0
  toktx 630.0328000001609 ms, size 98832
  wasm  1912.7081000003964 ms, size 98892
Config quality_1_compression_1
  toktx 761.4275000002235 ms, size 101899
  wasm  2918.5784999998286 ms, size 98703
Config quality_1_compression_3
  toktx 1578.139699999243 ms, size 116555
  wasm  5096.457899999805 ms, size 116287
Config quality_1_compression_5
  toktx 4076.6478000003844 ms, size 114959
  wasm  14401.164999999106 ms, size 115074
Config quality_16_compression_0
  toktx 636.5645000003278 ms, size 101870
  wasm  1462.8872999995947 ms, size 102370
Config quality_16_compression_1
  toktx 662.9963000006974 ms, size 101891
  wasm  2277.2050000000745 ms, size 99743
Config quality_16_compression_3
  toktx 1526.4877000004053 ms, size 110962
  wasm  5744.881999999285 ms, size 110924
Config quality_16_compression_5
  toktx 4155.633399998769 ms, size 112520
  wasm  19761.74410000071 ms, size 111158
Config quality_64_compression_0
  toktx 697.1180000007153 ms, size 113983
  wasm  1600.350700000301 ms, size 115679
Config quality_64_compression_1
  toktx 745.0608000010252 ms, size 116985
  wasm  2375.0175999999046 ms, size 116970
Config quality_64_compression_3
  toktx 1561.8352000005543 ms, size 129771
  wasm  6054.505800001323 ms, size 129950
Config quality_64_compression_5
  toktx 4216.207799999043 ms, size 129448
  wasm  20719.80779999867 ms, size 129345
Config quality_128_compression_0
  toktx 822.1613999996334 ms, size 134784
  wasm  1700.6072000004351 ms, size 134963
Config quality_128_compression_1
  toktx 795.0259000007063 ms, size 140748
  wasm  2571.4846999999136 ms, size 140780
Config quality_128_compression_3
  toktx 1685.3234999999404 ms, size 155790
  wasm  7601.217299999669 ms, size 155328
Config quality_128_compression_5
  toktx 4566.450499998406 ms, size 155914
  wasm  23932.182700000703 ms, size 155575
Config quality_255_compression_0
  toktx 771.0270000007004 ms, size 229829
  wasm  1727.5591000001878 ms, size 229957
Config quality_255_compression_1
  toktx 1002.704600000754 ms, size 214117
  wasm  3564.362199999392 ms, size 213981
Config quality_255_compression_3
  toktx 2019.6533000003546 ms, size 219728
  wasm  16216.512600000948 ms, size 219632
Config quality_255_compression_5
  toktx 4591.273299999535 ms, size 220669
  wasm  33441.15720000118 ms, size 220408

Looks like it's very roughly in the "3-5 times slower" ballpark, but high compression with high quality seems to be really expensive. (I could clean that project up and put it in a ZIP here, if someone cares...)


(Maybe related?): I recently did some similar experiments to see to what extent the "noisiness" and "frequencies" in the input image affect the size of the KTX output. I could imagine that these aspects affect not only the size but also the time required for encoding. But that's just another dimension in this multi-dimensional rabbit hole; one can cross fingers and hope that this effect will be independent of the EXE-vs-WASM difference...


EDIT: Another aside: the fact that toktx has to read/write temp files is something to keep in mind; a "pure in-memory" solution might be desirable. The timings above do include the time for reading/writing the data, and it's nearly impossible to isolate that, but it might also be negligible compared to the actual encoding times.
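For anyone wanting to reproduce this kind of comparison, the harness boils down to something like the following sketch; the `encode` argument is a placeholder for the toktx wrapper or WASM call, and both backends would be passed through the same helper.

```javascript
// Minimal timing harness for comparing encoder implementations.
// `encode` is any async function from Uint8Array to Uint8Array;
// file I/O (for a toktx backend) would be counted inside `encode`,
// matching the timings quoted above.
async function timeEncoder(name, encode, input) {
  const start = performance.now();
  const output = await encode(input);
  return { name, ms: performance.now() - start, size: output.length };
}
```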

clibequilibrium commented 3 months ago

I am very interested in seeing this work. My use case is glTF optimization on the client before uploading to an S3 bucket. Having a toKTX transform would help a lot.

gz65555 commented 2 months ago

From a practical usage perspective, using the WASM version to compress textures in a Node.js environment is unacceptable due to its extremely slow speed (as demonstrated by javagl's test results).

In actual business operations, there are hundreds of textures, each with specific quality requirements.

clibequilibrium commented 2 months ago

Hi, any ETA or idea when a KTX transform will be available outside of the CLI? Thanks

donmccurdy commented 2 months ago

@clibequilibrium no, sorry, I've pinned this under the "blocked or over budget" milestone. From my perspective there's no path forward here, unless a non-CLI KTX2 compression library becomes available that has comparable performance (+/- 50%) to the KTX-Software CLI.

It is always possible to access image data in a script, and pass it to your image optimization library of choice:

const document = await io.read('path/to/scene.glb');
for (const texture of document.getRoot().listTextures()) {
  const image = texture.getImage(); // Uint8Array
  // ...
}

Maintaining bindings for Sharp (JPG, PNG, AVIF, and WebP) and KTX-Software (KTX2) texture compression is all I am willing to support out of my own time — adding another KTX2 library to the project is too much. If other developers or companies have resources (time or funding) to dedicate to this, that's always welcome and please reach out.

clibequilibrium commented 2 months ago


Thanks a lot for your reply, totally understandable and makes sense. I see somebody linked ktx2encoder that would work in browser. I will use your suggestion and see if I can come up with something specific to my project. GLTF transform has been very useful to my platform to optimize gltf assets that the users import, thank you for your work 🙌