KhronosGroup / SPIRV-Tools


SPIR-V module binary size / compression #382

Closed · dneto0 closed this issue 6 years ago

dneto0 commented 8 years ago

We've heard reports that SPIR-V modules are larger than the same shaders compiled to other representations.

First, the SPIR-V binary encoding is extremely regular and is designed to be very simple to handle; for example, the SPIRV-Tools binary parser is simple and nearly stateless. The flip side is that the encoding has lots of redundancy.
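As a rough illustration of that regularity (a minimal sketch, not the actual SPIRV-Tools parser): after the five-word header, each instruction's first word packs its word count in the high 16 bits and its opcode in the low 16 bits, so walking the module needs almost no state.

    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Minimal sketch of a (nearly) stateless walk over a SPIR-V binary.
    void WalkInstructions(const std::vector<uint32_t>& words) {
      const uint32_t kMagic = 0x07230203u;
      // Header: magic, version, generator, ID bound, reserved schema.
      if (words.size() < 5 || words[0] != kMagic) return;
      size_t i = 5;
      while (i < words.size()) {
        const uint16_t word_count = static_cast<uint16_t>(words[i] >> 16);
        const uint16_t opcode = static_cast<uint16_t>(words[i] & 0xFFFFu);
        if (word_count == 0 || i + word_count > words.size()) break;  // malformed
        std::printf("opcode %u, %u words\n", opcode, word_count);
        i += word_count;  // no state carried from one instruction to the next
      }
    }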

Second, Glslang generates binaries with OpName for many objects (see https://github.com/KhronosGroup/glslang/issues/316). Also, it doesn't attempt to use group decorations.

To make smaller binaries, we need to make the tools smarter: emit less redundant info in the first place, provide tools that eliminate redundancy (while still producing valid SPIR-V binaries), and provide semantically lossless compression and decompression.

This issue is a brain dump of a few ideas along these lines. (Keep in mind that SPIRV-Tools must remain unencumbered, including under a possible relicensing to the Apache 2 license.)

Random ideas, including ones that leave the result as a valid SPIR-V binary:

Generic compression ideas:

Low level encoding ideas (stateless):

Stateful encoding:

Anyway, this is just a start of what we could do.

aras-p commented 8 years ago

Good stuff! In my mind, there are several goals and several approaches here:

Given a SPIR-V program:

I'm playing around with various approaches for the last item -- the "varint" encoding you mentioned, and I'm also going to try delta-encoding the IDs (in almost all the programs I have here, the majority of IDs are 1-2 away from the IDs used in the previous instruction).
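A hedged sketch of that combination (an illustration only, not the actual smol-v code; the names are made up): store each ID as the zig-zag-mapped delta from the previously seen ID, then write that small value as a variable-length integer, so the common "1-2 away" case costs a single byte.

    #include <cstdint>
    #include <vector>

    // LEB128-style unsigned varint: 7 bits per byte, high bit = "more bytes follow".
    static void WriteVarint(std::vector<uint8_t>& out, uint32_t v) {
      while (v >= 0x80u) {
        out.push_back(static_cast<uint8_t>((v & 0x7Fu) | 0x80u));
        v >>= 7;
      }
      out.push_back(static_cast<uint8_t>(v));
    }

    // Zig-zag mapping so small negative deltas also become small unsigned values.
    static uint32_t ZigZag(int32_t d) {
      return (static_cast<uint32_t>(d) << 1) ^ static_cast<uint32_t>(d >> 31);
    }

    // Encode an ID as the delta from the previously encoded ID.
    void EncodeIdDelta(std::vector<uint8_t>& out, uint32_t id, uint32_t& prev_id) {
      WriteVarint(out, ZigZag(static_cast<int32_t>(id) - static_cast<int32_t>(prev_id)));
      prev_id = id;  // IDs close to the previous one encode as one byte
    }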

I have also seen some fairly unexpected stats from the shaders I have, e.g. OpVectorShuffle takes up a lot of total space - it's a very verbose encoding (very often 9 words), in most cases just representing a swizzle or a scalar splat. I have to dig in more to find whether there are similar patterns that could be either encoded more compactly, or encoded in a way that's more compressible.
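For illustration only (not necessarily how any existing tool handles it): a pure swizzle of a single vec4, which OpVectorShuffle spends about 9 words on, can be described by one byte at 2 bits per output lane.

    #include <cstdint>

    // Pack a 4-lane swizzle (each source index 0..3) into one byte, 2 bits per lane.
    uint8_t PackSwizzle(uint8_t x, uint8_t y, uint8_t z, uint8_t w) {
      return static_cast<uint8_t>((x & 3u) | ((y & 3u) << 2) |
                                  ((z & 3u) << 4) | ((w & 3u) << 6));
    }

    // Recover the four source indices from the packed byte.
    void UnpackSwizzle(uint8_t packed, uint8_t out[4]) {
      for (int i = 0; i < 4; ++i) out[i] = (packed >> (2 * i)) & 3u;
    }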

Themaister commented 8 years ago

With shader variants, you'd probably find that the same basic blocks show up over and over, with maybe just shifted IDs. It could be interesting to have an archive format where SPIR-V files would do something like OpLabelLink $hash $id-shift, and basic blocks could be inlined at runtime before passing the result to the driver.
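A purely hypothetical sketch of that archive idea (OpLabelLink is not a real SPIR-V instruction, and the names below are made up): shared basic blocks are stored once, keyed by hash, and a module references them as (hash, id-shift) pairs to be inlined before the result is handed to the driver.

    #include <cstdint>
    #include <unordered_map>
    #include <vector>

    // Hypothetical archive of deduplicated basic blocks, keyed by content hash.
    struct BlockArchive {
      std::unordered_map<uint64_t, std::vector<uint32_t>> blocks;  // hash -> block words

      // Look up a shared block; a real inliner would also add id_shift to every
      // ID operand, which requires per-opcode grammar knowledge not shown here.
      const std::vector<uint32_t>* Lookup(uint64_t hash) const {
        auto it = blocks.find(hash);
        return it == blocks.end() ? nullptr : &it->second;
      }
    };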

aras-p commented 8 years ago

Experimenting with varint encoding + some delta encoding. Results look quite promising. Messy code on github (https://github.com/aras-p/smol-v -- Win/Mac builds)

But, testing on 113 shaders I have right now (caveat emptor: all produced by HLSL -> d3dcompiler -> DX11 bytecode -> HLSLcc -> GLSL -> glslang, so they might have some patterns that aren't "common" elsewhere):

Compression: original size 1314.8KB
0 Remap       1314.1KB  99.9%  glslang spirv-remap
0 SMOL-V       448.3KB  34.1% "my stupid code"

Compressed with LZ4 HC compressor at default settings ("re" = remapper, "sm" = my test):
1    LZ4HC     329.9KB  25.1%
1 re+LZ4HC     241.8KB  18.4%
1 sm+LZ4HC     128.0KB   9.7%

Compressed with Zstd compressor at default settings:
2    Zstd      279.3KB  21.2%
2 re+Zstd      188.7KB  14.4%
2 sm+Zstd      117.4KB   8.9%

Compressed with Zstd compressor at almost max setting (20):
3    Zstd20    187.0KB  14.2%
3 re+Zstd20    129.0KB   9.8%
3 sm+Zstd20     92.0KB   7.0%

There are a lot more instructions I could be encoding (so far I've just looked at the ones taking up the most space), and perhaps other tricks could be done. The shaders I have include debug names; I am not stripping them out.

(edit: updated with August 28 results)

dneto0 commented 8 years ago

@aras-p Very promising results! Thanks for sharing. Agreed: Transforms should be designed to make the result more compressible by standard (universal) compressors.

I also agree that we have to be mindful of using a reasonable tuning set of shaders. Here's a good project for someone: make a public repository of example shaders and define meaningful tuning sets over them.

Other encoding ideas:

   %a = ...
   ...
   ...
   %b = ...
   %sum = OpIAdd %int %a %b

into

   %a = ...
   ...
   %b = ...
   %sum = OpImplicitIAdd %int %a    ; will automatically use %b as the other operand

(This reminds me of life writing assembly for the 6502.) The idea here is that instead of encoding the operand explicitly (even with delta coding), the operand is implied, at the cost of slightly bloating the set of instruction opcodes.
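A hedged sketch of how an encoder might apply this (OpImplicitIAdd is hypothetical, exactly as in the example above; only the real OpIAdd opcode value comes from the spec): when the last operand is the most recently defined result ID, switch opcodes and drop that operand.

    #include <cstdint>
    #include <vector>

    constexpr uint16_t kOpIAdd = 128;             // real SPIR-V opcode for OpIAdd
    constexpr uint16_t kOpImplicitIAdd = 0xF000;  // hypothetical, not in the SPIR-V spec

    // operands = {result type, result id, operand 1, operand 2}
    std::vector<uint32_t> EncodeIAdd(const std::vector<uint32_t>& operands,
                                     uint32_t last_defined_id) {
      std::vector<uint32_t> words;
      if (!operands.empty() && operands.back() == last_defined_id) {
        // Drop the final operand; a decoder re-inserts last_defined_id.
        words.push_back((static_cast<uint32_t>(operands.size()) << 16) | kOpImplicitIAdd);
        words.insert(words.end(), operands.begin(), operands.end() - 1);
      } else {
        words.push_back((static_cast<uint32_t>(operands.size() + 1) << 16) | kOpIAdd);
        words.insert(words.end(), operands.begin(), operands.end());
      }
      return words;
    }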

johnkslang commented 8 years ago

Good stuff, thanks.

I also want to put in a plug for another dimension: generating less SPIR-V to begin with.

There are two big mechanisms SPIR-V is designed for that help with this:

  1. specialization constants
  2. lots of shaders in the same SPIR-V module

For the first, if multiple GLSL shaders are being generated with different values of constants (e.g., fixed sets of elements to process, or bools turning features on/off), it is possible to instead make one GLSL shader with specialization constants and wait until app run time to provide the actual constant values needed:

shader A:

const vec4 elements[4] = ...;
const bool feature = false;
... for (i = 0; i < 4; ++i) ...elements[i]...
... if (feature) ...

shader B:

const vec4 elements[8] = ...;
const bool feature = true;
... for (i = 0; i < 8; ++i) ...elements[i]...
... if (feature) ...

Single shader with specialization:

layout(constant_id=1) const int numElements = 4; // can be changed to 8 at run time
const vec4 elements[numElements] = ...;
layout(constant_id=2) const bool feature = false;  // can be changed to true at run time
... for (i = 0; i < numElements; ++i) ...elements[i]...
... if (feature) ...

This turns multiple shaders into a single shader, long before compression even comes into play.
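For completeness, a minimal sketch of how an application might supply those values at run time through Vulkan's VkSpecializationInfo (the constant_id values match the GLSL above; the struct layout and names are illustrative, and the surrounding pipeline setup is omitted):

    #include <vulkan/vulkan.h>
    #include <cstddef>
    #include <cstdint>

    // Data blob whose members correspond to the specialization constants above.
    struct SpecData {
      int32_t numElements = 8;     // constant_id = 1, overriding the default of 4
      VkBool32 feature = VK_TRUE;  // constant_id = 2, overriding the default of false
    };

    // Build the VkSpecializationInfo that is later pointed to by
    // VkPipelineShaderStageCreateInfo::pSpecializationInfo.
    VkSpecializationInfo MakeSpecializationInfo(const SpecData& data,
                                                VkSpecializationMapEntry entries[2]) {
      entries[0] = {1, static_cast<uint32_t>(offsetof(SpecData, numElements)),
                    sizeof(int32_t)};
      entries[1] = {2, static_cast<uint32_t>(offsetof(SpecData, feature)),
                    sizeof(VkBool32)};
      VkSpecializationInfo info{};
      info.mapEntryCount = 2;
      info.pMapEntries = entries;
      info.dataSize = sizeof(SpecData);
      info.pData = &data;
      return info;
    }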

For the second point, @dneto0 already touched on it with:

Link shaders together into a single SPIR-V module, to share common declarations (like types), and share helper function bodies.

It's possible that enough ID remapping and cross-file compression would recognize the commonality (it would be good to find out how much of that is happening), but if not, two other approaches would help:

benvanik commented 8 years ago

As a place to look for inspiration, the WebAssembly group may have some relevant ideas. They've heavily iterated on efficient instruction encodings that are easy to parse and compress well. They ended up with LEB128 varint encoding for most things, as well as some ways of reducing long instruction encodings (like the 9-word swizzle mentioned above). Some docs here, but if anyone's interested they could ping the group and chat - I'm sure they'd be willing to share what they learned along the way :)

johnkslang commented 8 years ago

From @aras-p:

...several approaches here: Given a SPIR-V program:

  • Make it more compressible, while still keeping it a valid SPIR-V that does the same thing. This is what spirv-remap does.

This is an important design constraint to be aware of. The question is whether anything not "off the shelf" is needed on the target (end-user) system. This applies to both decompression and denormalization.

Also key is whether multiple files are seen just at compression time, or earlier at normalization/remapping time.

The fuller taxonomy is more like:

All combinations make sense. The remapper was indeed intentionally targeting the combination of

These are all constraints, and certainly lifting any of them would enable a tool to perform better.

So, I'm curious to what extent gains were seen by lifting the constraints and to what extent by finding more ways of doing better normalization.

aras-p commented 8 years ago

I wrote up what I did so far here: http://aras-p.info/blog/2016/09/01/SPIR-V-Compression/

And indeed, the combination I chose is somewhat different from the remapper. I did this:

Now, my "normalization/denormalization" step also makes it smaller, so you could view it as some sort of compression too. But it's not a dictionary/entropy compression, so you can still compress it afterwards with regular off-the-shelf compressors.

johnkslang commented 8 years ago

Nice write-up, thanks.

dneto0 commented 7 years ago

@atgoo is contributing a codec to SPIRV-Tools. It's a work in progress (in source/comp and tools/comp), and I've seen internal reports of quite good results.

antiagainst commented 6 years ago

@dneto0, @atgoo: My impression is that the compression work is basically complete now. Should we close this?

atgoo commented 6 years ago

I would wait for feedback before doing any non-bugfix work, so yes, it could be called complete.

antiagainst commented 6 years ago

Okay, I'll close this then.