
Adopt (or simply promote) a new true HDR file format leveraging JPEG/WEBP compression. Potentially alleviates HDR and EXR file size issues #27171

Closed · daniele-pelagatti closed this issue 11 months ago

daniele-pelagatti commented 1 year ago

Description

.hdr and .exr files are commonly used for generating HDR envMaps and scene backgrounds. They work nicely as long as you keep them within a reasonable file size, which in turn constrains their maximum resolution.

The webgl_materials_envmaps_hdr example is a perfect illustration of this problem: the background is low resolution because otherwise it would need to download huge .exr files.

Some people (like @elalish) have lamented the disadvantages of traditional HDR files because of their huge file size.

Solution

We ported a new technology called gain maps to JavaScript (also published on NPM).

Gain maps work by reconstructing an HDR image starting from 3 pieces of information:

  1. A normal 8-bit SDR representation of the image.
  2. Another 8-bit recovery image called the gain map.
  3. Some metadata that tells the decoder how to apply the gain map to the SDR image.

These 3 pieces of information can be kept separate or embedded into a single image file.
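
To make the reconstruction concrete, here is a very simplified sketch of the idea (single channel, ignoring the gamma and offset terms defined in the spec; the parameter names are illustrative):

// Simplified per-pixel reconstruction. "recovery" is the 8-bit gain map sample
// normalized to [0, 1]; the min/max boosts come from the metadata, in stops (log2).
function applyGainMap (sdr, recovery, gainMapMinLog2, gainMapMaxLog2, weight) {
  // interpolate the per-pixel boost between the minimum and maximum boost
  const logBoost = gainMapMinLog2 * (1 - recovery) + gainMapMaxLog2 * recovery
  // weight in [0, 1] scales the boost to what the target display can show;
  // weight = 1 recovers the full HDR value
  return sdr * Math.pow(2, logBoost * weight)
}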

Keeping the files separate allows you to use additional compressed image formats, including webp.

Embedding the gain map into a single file is theoretically possible with the jpeg, heic, avif, jpeg xl, tiff and dng file formats (see the spec at How to Store Gain Maps in Various File Formats), but we currently implemented a wasm encoder/decoder only for jpeg. It is a fork of Google's own Ultra HDR file format library compiled to wasm (see additional context below).

Additionally, we created a free online converter that lets you input an .exr or .hdr file and convert it to a gain map. It also allows you to view already created gain maps as if they were conventional HDR files.

The online converter/viewer tool works entirely in your browser; encoding and decoding happen on your CPU and GPU.

For starters, we propose the integration of an external example (like previously done with this one), which leverages our Three.js loaders.

I can create a pull request that adds such an example if you like the technology and the implementation.

I'm also available to answer any questions you may have about our implementation and/or the technology in general.
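
For reference, a rough usage sketch of the loader-based flow (the loader class name, import path and file names below are assumptions for illustration; the actual API is documented in the library's examples):

import { WebGLRenderer, MeshBasicMaterial } from 'three'
// hypothetical export name, used only for illustration
import { GainMapLoader } from '@monogrid/gainmap-js'

const renderer = new WebGLRenderer()

// load the SDR image, the gain map image and the metadata produced by the converter
const result = await new GainMapLoader(renderer).loadAsync([
  'envmap.webp', // illustrative file names
  'envmap-gainmap.webp',
  'envmap.json'
])

// the reconstructed HDR data lives in a render target and never leaves the GPU
const material = new MeshBasicMaterial({ map: result.renderTarget.texture })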

Alternatives

RGBM is a comparable alternative but it has the following disadvantages:

  1. HDR range is limited to 3-4 stops; after that the technique starts to fall apart.

    Gain maps have no such limitations and are able to represent the full un-clipped HDR range (as far as Half Float precision goes).

  2. Requires a JavaScript PNG parser, which is not very fast; big RGBM images can take a long time to parse.

    JPEG and WEBP gain maps leverage the browser's built-in decoding, and the reconstruction of the HDR image is accomplished with a custom shader, which is near instant (see the sketch after this list).

  3. PNG compression is not as good as JPEG and WEBP, so file size is still an issue sometimes.
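
To make point 2 concrete: the gain map path only needs the browser's own image decoder plus a shader pass, roughly like this (sdrBlob and gainMapBlob are placeholders for fetched files):

import { Texture } from 'three'

// the browser decodes the JPEG/WEBP blobs itself, no JS pixel parsing is involved
const [sdrBitmap, gainMapBitmap] = await Promise.all([
  createImageBitmap(sdrBlob),
  createImageBitmap(gainMapBlob)
])

// wrap the decoded bitmaps in textures; the HDR reconstruction itself then
// runs in a fragment shader on the GPU, which is near instant
const sdrTexture = new Texture(sdrBitmap)
sdrTexture.needsUpdate = true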

LogLuv is also a comparable alternative but:

  1. File size is not comparable to gain maps, so it does not really resolve the file size issue we are trying to address.
  2. Like RGBM, it requires manual decoding in JavaScript, which can be slow for large images.

Additional context

Google is adopting the gain map technology in Android 14, but it refers to it as the Ultra HDR Image Format, and a JPEG file with an embedded gain map is called JPEGR in their terminology.

Hence the terms Ultra HDR and Gain Map are effectively synonyms. This can be a little confusing, but the technology is still evolving and standard names are not established yet.

Chrome supports native JPEG gain map decoding (with initial AVIF gain map support behind a flag: chrome://flags/#avif-gainmap-hdr-images). This allows users with HDR displays to visualize HDR content compressed as JPEG or AVIF.

It is unclear whether, in the future, Chrome's JS APIs will allow natively obtaining Uint16Array buffers from gain map images; for the moment we do it ourselves with our library.


elalish commented 1 year ago

Excellent! I was just looking up info on the UltraHDR format, thinking how nice it would be if we had a polyfill to support it for three.js environment maps across platforms - it sounds like you've done just that, so thank you!

You refer to 10-bit HDR, but I would like to remind everyone here that there's TV HDR (where a few extra fixed-point bits is adequate) and then there's physical lighting HDR, which has many orders of magnitude higher dynamic range requirements. In three.js we store environment lighting as half-float textures on the GPU for this reason, and I've even seen clipping occasionally on these! I would recommend updating your example with true HDR lighting, e.g. Spruit Sunrise.

Chrome's API for Uint16Array buffers will be inadequate for our use case, but in your JS library it should be fairly trivial to return a half-float texture, right? And from my brief look through UltraHDR/GainMap tech, it seems perfectly capable of encoding arbitrarily extreme ranges too - is that your understanding as well?

I would love to support this effort in any way I can - certainly I would be happy to add this support to model-viewer and promote it with our users.

krispya commented 1 year ago

This is an exciting development. I have been leveraging KTX2 compressed textures lately, UASTC mostly but also BasisLZ, to save VRAM. Do you think GainMaps could work with compressed textures as well?

elalish commented 1 year ago

This is an exciting development. I have been leveraging KTX2 compressed textures lately, UASTC mostly but also BasisLZ, to save VRAM. Do you think GainMaps could work with compressed textures as well?

I've actually been having a discussion with the folks who invented the KTX2 compression about this. It's a very different technology from JPEG, so the short answer I believe is no, but the good news is they are looking at their own way to compress HDR data, so stay tuned!

krispya commented 1 year ago

Thanks for the information.

arpu commented 1 year ago

@elalish maybe this is related? https://github.com/richgel999/png16/tree/main

daniele-pelagatti commented 1 year ago

@elalish

You refer to 10-bit HDR, but I would like to remind everyone here that there's TV HDR (where a few extra fixed-point bits is adequate) and then there's physical lighting HDR, which has many orders of magnitude higher dynamic range requirements.

I must admit I simply copy/pasted the format description coming from libultrahdr (which they have now, of course, corrected themselves), which mentioned 10-bit, but the format itself supports un-clipped HDR ranges and hence is suitable for IBL workflows. I've edited the title accordingly.

In three.js we store environment lighting as half-float textures on the GPU for this reason, and I've even seen clipping occasionally on these! I would recommend updating your example with true HDR lighting, e.g. Spruit Sunrise.

Sure, I can update the example right away to show an un-clipped HDR range encoded with our tool. EDIT: The example is now updated using Spruit Sunrise.

Chrome's API for Uint16Array buffers will be inadequate for our use case, but in your JS library it should be fairly trivial to return a half-float texture, right? And from my brief look through UltraHDR/GainMap tech, it seems perfectly capable of encoding arbitrarily extreme ranges too - is that your understanding as well?

Exactly! The format (and our library) already encodes an unlimited HDR range (it uses half-float render targets and returns Uint16Array data, if needed), unless you choose not to: when using our online converter, under Encoding settings you can choose to limit the max content boost to an arbitrary number of stops, and the gain map will be clipped to that range. The default value is the full range of the input hdr/exr.

I would love to support this effort in any way I can - certainly I would be happy to add this support to model-viewer and promote it with our users.

Thanks! Our library and our online converter will stay free to use; if you find it useful you can integrate it as you please in model-viewer, we'd love that! Let us know if you have any problems with it, we are available for collaboration.

daniele-pelagatti commented 1 year ago

@krispya and @elalish

This is an exciting development. I have been leveraging KTX2 compressed textures lately, UASTC mostly but also BasisLZ, to save VRAM. Do you think GainMaps could work with compressed textures as well?

Well, technically, an HDR file could theoretically be reconstructed starting from two KTX textures (an SDR KTX and a gain map KTX). You can already do it this way:

  1. Use our converter to create a "Separate data" gain map and download the separate files
  2. Convert both the SDR and the gain map files to KTX
  3. Store the two KTX files + the JSON metadata somewhere

When you need to load the HDR image you can follow our https://github.com/MONOGRID/gainmap-js/blob/main/examples/decode-from-separate-data.ts example and replace the TextureLoader we used with a KTX loader.
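
A rough, untested sketch of that substitution (assuming the decode() entry point and metadata shape follow the linked example, and that KTX2Loader textures are accepted as-is):

// Same flow as the linked example, with KTX2Loader in place of TextureLoader.
import { WebGLRenderer, MeshBasicMaterial } from 'three'
import { KTX2Loader } from 'three/examples/jsm/loaders/KTX2Loader.js'
import { decode } from '@monogrid/gainmap-js' // assumed export, see the linked example

const renderer = new WebGLRenderer()
const ktx2 = new KTX2Loader().setTranscoderPath('basis/').detectSupport(renderer)

const [sdr, gainMap, metadata] = await Promise.all([
  ktx2.loadAsync('envmap-sdr.ktx2'), // illustrative file names
  ktx2.loadAsync('envmap-gainmap.ktx2'),
  fetch('envmap.json').then(res => res.json())
])

// hand the two textures plus the metadata to the decoder, as in the example
const result = decode({ sdr, gainMap, renderer, ...metadata })
const material = new MeshBasicMaterial({ map: result.renderTarget.texture })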

The technology is very interesting because it allows you to use any 8-bit image format, as long as it can be loaded by three.js.

krispya commented 1 year ago

Someone with deeper knowledge can correct me, but I believe the problem with textures like KTX2 which are uncompressed on the GPU is that the decode process would also need to happen on the GPU as a compute shader.

daniele-pelagatti commented 1 year ago

@krispya

Someone with deeper knowledge can correct me, but I believe the problem with textures like KTX2 which are uncompressed on the GPU is that the decode process would also need to happen on the GPU as a compute shader.

Preface: I must admit that reconstructing an HDR image using KTX textures has not been tested at all.

Our implementation, though, reconstructs the full HDR range precisely on the GPU: given two SDR images and the reconstruction metadata, a simple shader (not a compute shader) is sufficient.

Our decoder, in fact, returns a WebGLRenderTarget which you can pass to another Material as map, envMap etc.

So this is theoretically feasible.

See both our examples w/o Loader and with a Loader and you'll notice we populate a material map using result.renderTarget.texture which never leaves the GPU.

The only issue we found is when you need to use the renderTarget.texture with EquirectangularReflectionMapping, in which case we found that simply doing

renderTarget.texture.mapping = EquirectangularReflectionMapping
renderTarget.texture.needsUpdate = true
material.map = renderTarget.texture

does not work, so you must request a conversion to a DataTexture, which is implemented with readRenderTargetPixels (contrary to what it says in the docs, it returns a Uint16Array for HalfFloatType render targets). This somewhat defeats the purpose of KTX textures because it requires a round trip between the GPU and JS (plus the resulting DataTexture is no longer compressed on the GPU).
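
For clarity, the fallback described above amounts to something like this (a sketch assuming an RGBA HalfFloatType render target; renderer, renderTarget and scene come from the surrounding application code):

import { DataTexture, RGBAFormat, HalfFloatType, EquirectangularReflectionMapping } from 'three'

// copy the decoded pixels back from the GPU into JS memory (Uint16Array for half floats)
const { width, height } = renderTarget
const pixels = new Uint16Array(width * height * 4)
renderer.readRenderTargetPixels(renderTarget, 0, 0, width, height, pixels)

// rebuild an uncompressed half-float texture that can be used as an equirect background
const dataTexture = new DataTexture(pixels, width, height, RGBAFormat, HalfFloatType)
dataTexture.mapping = EquirectangularReflectionMapping
dataTexture.needsUpdate = true
scene.background = dataTexture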

Maybe someone on the three.js team can shed light on why this happens.

Otherwise feel free to experiment and let us know your findings!

elalish commented 1 year ago

Thanks for updating your example - it looks great! I've just been testing your compression tool on some of my favorite HDR environments and I'm seeing a 10x - 20x improvement in file size in JPEG mode. I just checked bundlephobia and it says your npm module comes in at 6kb - not bad, but smaller would be great. I assume that's for both encode and decode? How small can we get it for just decode?

I would love to see a PR for this into Three - I have several users who need this desperately. It's so frustrating to put effort into compressing a GLB really well only to serve it with an environment image that's equally large.

elalish commented 1 year ago

Personally, I prefer the JPEG solution, as single-file is much easier logistically for editing and serving. I see this requires the UltraHDR wasm module. But all this does is parse a bit of metadata from the JPEG header, right? Seems easy enough to rewrite that as a tiny bit of JS. Or am I missing something?

daniele-pelagatti commented 1 year ago

Thanks for updating your example - it looks great! I've just been testing your compression tool on some of my favorite HDR environments and I'm seeing a 10x - 20x improvement in file size in JPEG mode. I just checked bundlephobia and it says your npm module comes in at 6kb - not bad, but smaller would be great. I assume that's for both encode and decode? How small can we get it for just decode?

I'd need to investigate; maybe separate the encode and decode functions under different exports, which is not a bad idea now that you mention it.

EDIT: Done in version 2.0.0

Personally, I prefer the JPEG solution, as single-file is much easier logistically for editing and serving.

Keep in mind that editing a JPEG with an embedded gain map is not easily done at the moment. All current photo editing software will open the base SDR representation and discard the gain map.

Some notable exceptions are:

Plus, the simple act of sharing a gain map JPEG via an image sharing service (Twitter, Facebook, Slack, etc.) often leads to the loss of metadata, which in turn means the HDR info is gone.

I see this requires the UltraHDR wasm module. But all this does is parse a bit of metadata from the JPEG header, right? Seems easy enough to rewrite that as a tiny bit of JS. Or am I missing something?

In order to decode a jpeg with embedded gain map you need to:

  1. Parse a new XMP metadata namespace, which is theoretically feasible in pure JS but for which I did not find any support in commonly used "pure JS" packages. All the tests I've done using a plethora of npm packages led to no satisfactory results. I'm not an expert in this area, so I gave up at a certain point.
  2. Extract the recovery gain map image, which is stored (as a binary JPEG) using a "Multi Picture Format" JPEG tag. This has some example implementations in pure JS, but it could potentially have speed issues; I'm not sure myself.

Given these factors, I found it easier to just compile libultrahdr to wasm and let it handle these two steps. Most importantly, the wasm also makes it possible to "pack" the SDR + gain map + metadata into a single JPEG, which is needed in the encoding process; I felt it was too ambitious to try and write all of this myself and just used the wasm instead.

Having said this, I'm completely open to suggestions and blazing fast pure JS implementations :) I must admit I'm not overly fond of the wasm module myself... it's just the fastest and most effective way of accomplishing the task I've found so far.

EDIT: scratch that, I managed to get rid of the wasm for extracting the XMP metadata and the gain map image.

The whole extraction process in pure js lasts ~158ms for a 16k Equirectangular JPEG image which is the maximum supported resolution of our library.

For comparison, the texSubImage2D call for uploading the same texture on the GPU lasts ~1176ms on my machine so the parsing speed is not that bad.

I've updated the example with the new pure JS implementation, published a new 2.0.0 version on npm, and bundlephobia now reports a minified + gzipped size of 4.2kB.
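
For anyone curious, the core of step 2 above (locating the gain map JPEG appended after the primary image) can be approximated in pure JS along these lines. This is a naive illustration only; a robust extractor should follow the MPF offsets rather than scanning:

// Naive sketch: split an Ultra HDR style JPEG byte stream into its primary (SDR)
// image and the appended gain map image by looking for a second SOI marker.
function splitGainMapJpeg (bytes) {
  // skip the primary image's own SOI at offset 0
  for (let i = 2; i < bytes.length - 2; i++) {
    if (bytes[i] === 0xff && bytes[i + 1] === 0xd8 && bytes[i + 2] === 0xff) {
      return {
        sdr: bytes.subarray(0, i), // primary JPEG
        gainMap: bytes.subarray(i) // appended gain map JPEG
      }
    }
  }
  return null // no embedded gain map found
}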

donmccurdy commented 1 year ago

I don't know if KTX2 pairs of SDR+gain files is general-purpose enough that we'd want to make the required changes throughout three.js to support them, though it could be implemented in userland and would be interesting to compare. Note: It is critical that compressed textures in KTX2 containers remain compressed on the GPU.

For a while now, I've wished we had a practical way to produce KTX2 files using BC6H compression, which remains compressed on the GPU, and has runtime advantages not available by any other method in this thread. Not all platforms support BC6H, but Khronos has published lightweight decoders that can be used to produce f16 data when runtime support isn't available.

I imagine BC6H complements the approach here well – you might, for example, use libultrahdr when network bandwidth is the top concern, and KTX2 when VRAM budgets or texture upload without dropping frames are required.

I think 6kb is an excellent tradeoff for the savings libultrahdr provides, and honestly — I don't think I've ever seen a useful WASM module packaged that small before, great work!

daniele-pelagatti commented 1 year ago

@donmccurdy

I don't know if KTX2 pairs of SDR+gain files is general-purpose enough that we'd want to make the required changes throughout three.js to support them, though it could be implemented in userland and would be interesting to compare. Note: It is critical that compressed textures in KTX2 containers remain compressed on the GPU.

The KTX discussion was a bit off topic; the main purpose of our library is to serve JPEG files (or separate WEBP files, if you really want to save some more file size) with full-range HDR capabilities.

It is theoretically possible to reconstruct a full-range HDR half-float render target from two KTX files, but it has never been tested at all. It is possible because our decoder is effectively a shader, so it can return a renderTarget, and renderTarget.texture can be passed around (with some limitations).

I think 6kb is an excellent tradeoff for the savings libultrahdr provides, and honestly — I don't think I've ever seen a useful WASM module packaged that small before, great work!

hold on the compliments :) 6kb is only the JS part, the wasm itself is approx 168kb gzipped...

Speaking of wasm modules, @elalish's words resonated with me over the weekend so... good news! I'm preparing an upgrade of the library with a pure-js implementation of the jpeg decoder, so: no wasm for general usage!

I think we'll keep the wasm only for the encoder part (which I'm in the process of separating from the decoder part, also following @elalish's suggestion).

So stay tuned, I'll publish a new npm package + example soon and, once it's done, I was thinking of opening a PR with our example.

donmccurdy commented 1 year ago

hold on the compliments :) 6kb is only the JS part, the wasm itself is approx 168kb gzipped...

Still great compared to the cost of HDR files, and comparable to the binaries we're using to decode Basis and Draco compression, so I'm not withdrawing the compliment. :)

I'm preparing an upgrade of the library with a pure-js implementation of the jpeg decoder, so: no wasm for general usage!

Will be interested to check that out!

elalish commented 1 year ago

Excellent work! This looks like everything we need for an efficient decoding solution in three. Thanks for reducing dependencies!

daniele-pelagatti commented 12 months ago

@krispya

You have sparked my interest in the KTX topic: I did a quick test with 8k KTX textures, reconstructing an HDR image using the gain map technique, and it seems to work nicely!

https://monogrid.github.io/gainmap-js/ktx.html

If the decoded renderTarget could be used directly with EquirectangularReflectionMapping as scene.background, the trick would be done entirely on the GPU. Again, I'm not sure why it can't be used that way; I'm sure there's a good reason.

Another quirk is that the KTX texture seems flipped on the Y axis: I'm flipping the scene.background texture, but I'm not sure if the envMap is generated upside down (also, the debug plane seems white?).

I've read somewhere that KTX textures need to be flipped manually; I'll see what I can do on my side, maybe in the decoding shader.

elalish commented 12 months ago

Keep in mind that KTX doesn't really help here - it's not any smaller over the wire, it's only smaller in GPU memory. However, since we have to process these images on the GPU and the GPU can't write to compressed textures, there is no GPU memory savings. In this case JPEG will actually give better performance.

Mugen87 commented 11 months ago

With #27183 and #27230 merged, I guess this issue can be closed. Excited to see Ultra HDR with r159!