laurentlb / shader-minifier

Minify and obfuscate GLSL or HLSL code
https://ctrl-alt-test.fr/minifier/
Apache License 2.0

Generate a .kkp source map #361

Open therontarigo opened 3 months ago

therontarigo commented 3 months ago

The kkpView tool shows a report of executable compression, organized by symbols such as C++ functions or string constants (such as shaders). kkpView also displays source file+line information per-byte when included in the KKP file.

https://github.com/laurentlb/Shader_Minifier/commit/8f68b4893705927f67a1777574e01e874b418ce4 added support for exporting SYM files that break a shader symbol down into sub-sections such as functions.

Since https://github.com/laurentlb/Shader_Minifier/commit/a2511dc729d3ca2605c5cee4bc43b07abef82b7e the minifier has recorded source line information, which the JS web interface uses.

If a KKP file is generated, the same information can be provided to kkpView as well. This introduces a new use of the KKP file format: instead of representing a real executable file, it represents only one or more symbols, and the contents of the "described binary" are exactly the bytes of the minified shader or shaders. The result may be viewed as-is in kkpView or combined with a KKP representing a real executable using kkpmerge.

https://github.com/ConspiracyHu/kkpView-public
https://github.com/therontarigo/kkpmerge

The KKP format:

4 bytes: FOURCC: 'KK64'
4 bytes: size of described binary in bytes (Ds) -- size of minified shaders concatenated
4 bytes: number of source code files (Cc)

// source code descriptors:
Cc times:
    ASCIIZ string: filename
    float: packed size for the complete file -- equal to unpacked size
    4 bytes: unpacked size for the complete file, in int -- count of bytes referencing this source file

4 bytes: number of symbols (Sc)
// symbol data:
Sc times: -- one symbol per minified shader.
    ASCIIZ string: symbol name -- names match those used by c-variables output
    double: packed size of symbol -- equal to unpacked size
    4 bytes: unpacked size of symbol in bytes -- minified shader size
    1 byte: boolean to tell if symbol is code (true if yes) -- false
    4 bytes: source code file ID
    4 bytes: symbol position in executable

// binary compression data:
Ds times: (for each byte of the described binary)
    1 byte: original data from the binary
    2 bytes: symbol index
    double: packed size -- 1 byte (no packing)
    2 bytes: source code line
    2 bytes: source code file index
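
As a rough illustration of the layout above, here is a minimal Python sketch that serializes one or more minified shaders into a KKP byte string. Several things are assumptions not stated in the format description: little-endian byte order, IEEE 754 float/double encoding, ASCII-only names, and one source file per shader. The function name `build_kkp` and its argument shapes are hypothetical; this is not the minifier's code.

```python
import struct

def asciiz(text):
    """Encode a NUL-terminated ASCII string."""
    return text.encode("ascii") + b"\x00"

def build_kkp(source_files, shaders):
    """Serialize shaders into KKP bytes (sketch; little-endian is assumed).

    source_files: list of filenames.
    shaders: list of (symbol_name, file_index, minified_bytes, lines),
             where lines[i] is the source line for minified_bytes[i].
    """
    for _, _, code, lines in shaders:
        if len(code) != len(lines):
            raise ValueError("need exactly one source line per output byte")

    total = sum(len(code) for _, _, code, _ in shaders)
    out = bytearray(b"KK64")
    out += struct.pack("<II", total, len(source_files))

    # Source code descriptors: packed size equals unpacked size (no packing).
    for index, filename in enumerate(source_files):
        size = sum(len(code) for _, f, code, _ in shaders if f == index)
        out += asciiz(filename)
        out += struct.pack("<fI", float(size), size)

    # Symbol data: one symbol per minified shader, flagged as non-code.
    out += struct.pack("<I", len(shaders))
    position = 0
    for name, file_index, code, _ in shaders:
        out += asciiz(name)
        out += struct.pack("<dIBII", float(len(code)), len(code), 0,
                           file_index, position)
        position += len(code)

    # Per-byte data: each byte "packs" to exactly 1 byte.
    # struct raises struct.error if a line or index does not fit in 16 bits.
    for symbol_index, (_, file_index, code, lines) in enumerate(shaders):
        for byte, line in zip(code, lines):
            out += struct.pack("<BHdHH", byte, symbol_index, 1.0,
                               line, file_index)
    return bytes(out)

# Hypothetical two-byte "shader" whose bytes come from source lines 1 and 2.
data = build_kkp(["a.frag"], [("shader_frag", 0, b"ab", [1, 2])])
```

Symbol positions are simply running offsets into the concatenated shaders, which matches the idea that the "described binary" is nothing but the minified output.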

therontarigo commented 3 months ago

This is effectively #334 with a different format.

therontarigo commented 3 months ago

WIP hack here: https://github.com/therontarigo/Shader_Minifier/blob/kkpsourcemap/src/printer.fs (far from ready for a PR). It abuses the printer's current ability to append line numbers to identifier names, but it provides a working example of KKP file output.

Parsing emitted @line,col@ from the printer output is awkward: it would be better to record this information directly while the printer works, but this will require some refactoring.
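
To show why this is awkward, here is a minimal Python sketch of the post-processing step. It assumes markers look exactly like `@line,col@`, that each marker follows the token it describes, and that the output is ASCII so characters and bytes coincide. The name `strip_markers` and the rule for attributing unmarked bytes are hypothetical, not what the WIP branch does.

```python
import re

# Assumed marker shape: "@<line>,<col>@" appended after an identifier.
MARKER = re.compile(r"@(\d+),(\d+)@")

def strip_markers(annotated):
    """Return (clean_text, lines) where lines[i] is the line for clean_text[i].

    Text between two markers is attributed to the marker that follows it;
    text after the last marker reuses that marker's line (0 if none exists).
    """
    clean_parts = []
    lines = []
    last_line = 0
    position = 0
    for match in MARKER.finditer(annotated):
        chunk = annotated[position:match.start()]
        last_line = int(match.group(1))
        clean_parts.append(chunk)
        lines.extend([last_line] * len(chunk))
        position = match.end()
    tail = annotated[position:]
    clean_parts.append(tail)
    lines.extend([last_line] * len(tail))
    return "".join(clean_parts), lines

clean, lines = strip_markers("float a@1,7@=1.,b@2,7@=2.;")
```

In this example the `=1.,` that belongs to line 1 ends up attributed to line 2, because only identifiers carry markers. Recording positions inside the printer would avoid that kind of guesswork.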

The following applies also to #334 :

Ultimately, all emitted tokens should provide line info, even when it must be contrived. For example:

Input

    float varA = expr1;
    float varB = expr2;

Output

  float     // assign it line 1, even though both source lines declare floats
    a=expr1 // all tokens trivially line 1 (unless something is inlined here)
  ,         // it did not exist in source, but it can be line 1 (all declarations end in "," or ";")
    b=expr2 // all tokens trivially line 2 (unless something is inlined here)
  ;         // assign it line 2, even though both source lines were declarations
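
The line assignment in the example above can be sketched in Python as follows. The function name `merge_declarations` and the tuple shapes are hypothetical, and the sketch covers only merged declarations of a single type, with renaming already applied.

```python
def merge_declarations(decls):
    """Merge same-type declarations into (token_text, source_line) pairs.

    decls: list of (type_name, name, init_expr, source_line).
    The type keyword takes the first declaration's line; each "," or ";"
    takes the line of the declaration it terminates.
    """
    if not decls:
        return []
    type_name = decls[0][0]
    tokens = [(type_name + " ", decls[0][3])]
    for index, (decl_type, name, init, line) in enumerate(decls):
        if decl_type != type_name:
            raise ValueError("only declarations of one type can be merged")
        tokens.append((name + "=" + init, line))
        separator = ";" if index == len(decls) - 1 else ","
        tokens.append((separator, line))
    return tokens

tokens = merge_declarations([
    ("float", "a", "expr1", 1),
    ("float", "b", "expr2", 2),
])
minified = "".join(text for text, _ in tokens)
```

Every emitted token carries a line here, including the synthesized `,`, so a per-byte table can be built without any later parsing.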

Also, it would be wise to keep in mind the future possibility of file numbers in addition to line numbers, in case of multiple source files. This could happen if the input uses a #line number "filename" directive (from an external preprocessor) or if the minifier itself ever grows #include support. The marker would then change from @line,col@ to @line,col,filenum@.