CesiumGS / cesium

An open-source JavaScript library for world-class 3D globes and maps
https://cesium.com/cesiumjs/
Apache License 2.0

GPU Metadata Property Table Packing for 3D Tiles Next #9572

Open ptrgags opened 3 years ago

ptrgags commented 3 years ago

One of the upcoming parts of our 3D Tiles Next effort is to pack metadata (specifically, feature tables) for use on the GPU. This will be necessary for both custom shaders (see https://github.com/CesiumGS/cesium/issues/9518) and GPU feature styling.

Packing Overview

The goal for this subsystem is to take the metadata from the CPU, pack it into GPU memory (textures, attributes and uniforms), and then unpack it in the shader.

Only properties used in the shader code will be uploaded to the GPU. @lilleyse’s model-loading branch will have a way to determine this.

Once uploaded and no longer needed on the CPU, try to free the CPU resources. We should include some options for controlling this.

We also want to make any texture management general-purpose, as the refactored Model.js will use other types of textures (feature textures, feature ID textures).

Datatype Compatibility

Not every data type is GPU-compatible. For example, STRING and variable-length ARRAY are not easily representable on the GPU. 64-bit types are not directly representable either, but a fallback is to convert them to 32-bit types.

Furthermore, WebGL 1 only supports 8-bit integer or 32-bit float (with OES_texture_float) textures. For larger integer types, multiple image channels or multiple pixels will have to be used.

Supported Types

Supported with Fallbacks

Not supported

Other Notes:

Encoding Considerations

There are some special cases where values need additional encoding:

Choosing a GPU layout

The main unknown right now is how to choose an optimal GPU layout. The calling code will provide a list of properties and information about what GPU resources are available. The layout algorithm needs to take this information and determine what textures/vertex attributes/uniforms to use to store the metadata.

One possibility is to divide the properties into three categories:

However, determining the exact layout is more involved. Here are some complicating factors:

Inputs:

Output:

This layout can be used by the caller to set the Property struct in the shader, as well as determine where/how to upload data to the GPU.

Stretch Goal: Filtering

A nice-to-have feature is a way for the user to filter properties. This has a number of benefits:

Potential downsides:

To Do:

ptrgags commented 3 years ago

Yesterday I discussed some details about textures with @lilleyse. Here are some notes from that:

rowsPerFeature = ceil(featureCount / maximumTextureWidth)
actualTextureWidth = ceil(featureCount / rowsPerFeature)

ptrgags commented 3 years ago

Proposed Metadata Packing Algorithm

At a high-level, the algorithm will have the following phases:

  1. Partition properties into the different types of WebGL concepts (textures/attributes/uniforms/constants)
  2. Determine if data types are representable on the GPU. If not, throw an error
  3. Determine the data type that will be used on the GPU and a list of steps needed to convert/pack the data for the GPU
  4. Group properties into textures/attributes by size. This can be packed tightly (storing multiple properties in a single texel/vec4 for better memory usage) or packed loosely (separate texels/attributes for easier interpolation). A flag should control this.
  5. Compute the exact layout including any byte/texel offsets as needed.
  6. "vacuum pack" the textures/attributes, e.g. choose smaller texture dimensions to avoid wasting memory, don't use unused texture channels, etc.

Partitioning Properties

For this first iteration, let's keep this simple using the rules I mentioned in the description. To recap:

  1. defaultValue properties are constants. They will be inlined in the shader code.
  2. Tileset/group/tile properties are constant over every vertex, but vary from content to content, so store them in uniforms
  3. For per-vertex properties (constant: 0, divisor: 1), use attributes
  4. Any other per-feature properties will use textures
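
The four rules above can be sketched as a single pass over the property list. This is only an illustration: the descriptor fields `usesDefaultValue`, `scope` and `perVertex` are hypothetical names made up for this sketch, not part of the CesiumJS metadata API.

```javascript
// Hypothetical property descriptor (assumption, not a CesiumJS type):
//   { name, usesDefaultValue, scope, perVertex }
// where scope is "tileset", "group", "tile" or "feature".
function partitionProperties(properties) {
  const partitions = {
    constants: [],
    uniforms: [],
    attributes: [],
    textures: [],
  };
  for (const property of properties) {
    if (property.usesDefaultValue) {
      // Rule 1: default values are inlined in the shader as constants.
      partitions.constants.push(property.name);
    } else if (property.scope !== "feature") {
      // Rule 2: tileset/group/tile properties vary only per content.
      partitions.uniforms.push(property.name);
    } else if (property.perVertex) {
      // Rule 3: per-vertex feature IDs (constant: 0, divisor: 1).
      partitions.attributes.push(property.name);
    } else {
      // Rule 4: every other per-feature property goes in a texture.
      partitions.textures.push(property.name);
    }
  }
  return partitions;
}
```

The rules are checked in the order they are listed, so a property that matches more than one rule lands in the earliest matching partition. That ordering is an assumption of this sketch.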

In theory, we might want to fall back between textures <-> attributes, but I'll hold off on this for this first iteration.

Type Representability

This step is simple: it rejects the following types as "not representable". Any other issues, such as lack of floating-point texture support, will be caught in the next step.

Computing Packed Types

This is the most involved phase of the algorithm. Essentially we want to go from a list of property types to a list of (packedType, channelCount: int, packingSteps: PackingFunction[]). This process varies depending on the destination (constant/uniform/attribute/texture), as WebGL has different rules for what types are allowed.

Packing functions are any steps needed to prepare the values for packing. They will be applied in order when packing, and the inverse will be performed in the shader to unpack the values.

Some packed types require a lossy conversion. We might want to log an error or throw an error when this happens.

Several types have similar packing rules, so here are some rules for converting these into a smaller set of types. These operations are added as packing rules. The following tables summarize these rules.

Notes:

Constant/Uniform Type Conversions:

| Type | Converted Type | Packing Function | Lossy |
| --- | --- | --- | --- |
| AnyScalarType | ARRAY[AnyScalarType, 1] | promoteScalarToArray | No |
| ARRAY[INT(8/16), N] | ARRAY[INT32, N] | promoteToInt | No |
| ARRAY[INT64, N] | ARRAY[FLOAT32, N] | convertInt64ToF32 | Yes |
| ARRAY[UINT(8/16), N] | ARRAY[UINT32, N] | promoteToUint | No |
| ARRAY[UINT64, N] | ARRAY[FLOAT32, N] | convertU64ToF32 | Yes |
| ARRAY[FLOAT64, N] | ARRAY[FLOAT32, N] | convertF64ToF32 | Yes |

At the end, only these families of types will remain: ARRAY[INT32, N], ARRAY[UINT32, N], ARRAY[FLOAT32, N], ARRAY[BOOLEAN, N]. They are translated to GLSL as follows:

| Type | GPU Types |
| --- | --- |
| ARRAY[INT32, N] | int/ivec2/ivec3/ivec4 |
| ARRAY[UINT32, N] | uint/uvec2/uvec3/uvec4 |
| ARRAY[FLOAT32, N] | float/vec2/vec3/vec4 |
| ARRAY[BOOLEAN, N] | bool/bvec2/bvec3/bvec4 |
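
As a rough sketch, the two constant/uniform tables above could be combined into one lookup that returns the packed type, the packing steps and the GLSL type. The packing-function names are taken from the tables; `packUniformType` and its return shape are hypothetical, chosen to mirror the (packedType, channelCount, packingSteps) tuple described earlier.

```javascript
// componentType -> [packed type, packing function name or null, lossy]
const UNIFORM_RULES = {
  INT8: ["INT32", "promoteToInt", false],
  INT16: ["INT32", "promoteToInt", false],
  INT32: ["INT32", null, false],
  INT64: ["FLOAT32", "convertInt64ToF32", true],
  UINT8: ["UINT32", "promoteToUint", false],
  UINT16: ["UINT32", "promoteToUint", false],
  UINT32: ["UINT32", null, false],
  UINT64: ["FLOAT32", "convertU64ToF32", true],
  FLOAT32: ["FLOAT32", null, false],
  FLOAT64: ["FLOAT32", "convertF64ToF32", true],
  BOOLEAN: ["BOOLEAN", null, false],
};
const GLSL_SCALAR = { INT32: "int", UINT32: "uint", FLOAT32: "float", BOOLEAN: "bool" };
const GLSL_VECTOR_PREFIX = { INT32: "i", UINT32: "u", FLOAT32: "", BOOLEAN: "b" };

// Hypothetical helper: pass count = undefined for a scalar property.
function packUniformType(componentType, count) {
  const rule = UNIFORM_RULES[componentType];
  if (rule === undefined) {
    throw new Error("Type is not representable: " + componentType);
  }
  const channelCount = count === undefined ? 1 : count;
  if (channelCount < 1 || channelCount > 4) {
    throw new Error("Only 1 to 4 components are handled in this sketch");
  }
  const [packedType, step, lossy] = rule;
  const packingSteps = [];
  if (count === undefined) {
    packingSteps.push("promoteScalarToArray");
  }
  if (step !== null) {
    packingSteps.push(step);
  }
  const glslType =
    channelCount === 1
      ? GLSL_SCALAR[packedType]
      : GLSL_VECTOR_PREFIX[packedType] + "vec" + channelCount;
  return { packedType, channelCount, packingSteps, glslType, lossy };
}

console.log(packUniformType("INT16", 3).glslType); // prints "ivec3"
```

Returning the `lossy` flag alongside the steps lets the caller decide whether to log or throw when a lossy conversion is chosen.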

Attribute Type Conversions:

| Type | Converted Type | Packing Function | Lossy |
| --- | --- | --- | --- |
| AnyScalarType | ARRAY[AnyScalarType, 1] | promoteScalarToArray | No |
| ARRAY[(U)INT(8/16), N] | ARRAY[FLOAT32, N] | convert(U)IntToF32 | No |
| ARRAY[(U)INT(32/64), N] | ARRAY[FLOAT32, N] | convert(U)IntToF32Lossy | Yes |
| ARRAY[BOOLEAN, N] | ARRAY[FLOAT32, N] | reinterpretBooleanAsF32 | No |
| ARRAY[FLOAT64, N] | ARRAY[FLOAT32, N] | convertF64ToF32 | Yes |

At the end, only the ARRAY[FLOAT32, N] family of types will remain. They are translated as float/vec2/vec3/vec4 in GLSL.
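
The Lossy column for integers follows from the 24-bit significand of a 32-bit float: every integer with magnitude up to 2^24 converts exactly, which covers all 8- and 16-bit values but not all 32- or 64-bit ones. A small check using the standard `Math.fround` shows the boundary. The helper name `isExactInFloat32` is made up for this sketch.

```javascript
// Returns true if the number survives a round trip through float32.
function isExactInFloat32(value) {
  return Math.fround(value) === value;
}

console.log(isExactInFloat32(65535)); // true: largest UINT16
console.log(isExactInFloat32(-32768)); // true: smallest INT16
console.log(isExactInFloat32(16777216)); // true: 2^24
console.log(isExactInFloat32(16777217)); // false: 2^24 + 1 rounds to 2^24
```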

Texture Type Conversions

| Type | Converted Type | Packing Function | Lossy |
| --- | --- | --- | --- |
| AnyScalarType | ARRAY[AnyScalarType, 1] | promoteScalarToArray | No |
| ARRAY[(U)INT64, N] | ARRAY[FLOAT32, N] | convert(U)Int64ToF32 | Yes |
| ARRAY[INTx, N] | ARRAY[UINTx, N] | reinterpretSignedAsUnsigned | No |
| ARRAY[BOOLEAN, N] | ARRAY[UINT8, N] | reinterpretBooleanAsU8 | No |
| ARRAY[FLOAT64, N] | ARRAY[FLOAT32, N] | convertF64ToF32 | Yes |

At the end, only these families of types will remain: ARRAY[UINT(8/16/32), N], ARRAY[FLOAT32, N]. The packed type is a little more involved, as it depends on whether float textures are supported via the OES_texture_float extension. Some types have a fallback when this is not available, others will throw errors. In some cases, more packing functions are needed.

| Type | With OES_texture_float | OES_texture_float unavailable | Packing Functions |
| --- | --- | --- | --- |
| ARRAY[FLOAT32, 1] | FLOAT32 texture, 1 channel | UINT8 texture, 4 channels | packFloatAsRGBA (without float textures) |
| ARRAY[FLOAT32, N] | FLOAT32 texture, N channels | Unsupported | None |
| ARRAY[UINT8, N] | UINT8 texture, N channels | UINT8 texture, N channels | None |
| ARRAY[UINT16, 1] | FLOAT32 texture, 1 channel | UINT8 texture, 2 channels | packUint16AsFloat32 or packUint16As2Channels |
| ARRAY[UINT16, 2] | FLOAT32 texture, 2 channels | UINT8 texture, 4 channels | packUint16AsFloat32 or packUint16As2Channels |
| ARRAY[UINT16, N] | FLOAT32 texture, N channels | Unsupported | packUint16AsFloat32 |
| ARRAY[UINT32, 1] | FLOAT32 texture, 1 channel (lossy) | UINT8 texture, 4 channels (not lossy) | packUint32AsFloat32 or packUint32AsRGBA |
| ARRAY[UINT32, N] | FLOAT32 texture, N channels (lossy) | Unsupported | packUint32AsFloat32 |
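
For the UINT8 fallbacks, the multi-channel packing functions amount to splitting an integer into bytes. Here is a minimal CPU-side sketch of `packUint16As2Channels` and `packUint32AsRGBA` with their inverses. The function names come from the table, but the byte order (low byte first) and the `unpack*` helpers are assumptions. In practice the inverse would run in the shader; it is mirrored in JavaScript here only so the round trip can be checked.

```javascript
// Split a UINT16 into two 8-bit channels, low byte first (assumed order).
function packUint16As2Channels(value) {
  return [value & 0xff, (value >>> 8) & 0xff];
}

function unpackUint16From2Channels(channels) {
  return channels[0] + channels[1] * 256;
}

// Split a UINT32 into four 8-bit channels (R = low byte, A = high byte).
function packUint32AsRGBA(value) {
  return [
    value & 0xff,
    (value >>> 8) & 0xff,
    (value >>> 16) & 0xff,
    (value >>> 24) & 0xff,
  ];
}

function unpackUint32FromRGBA(rgba) {
  // Multiplication is used instead of << because JavaScript shifts
  // produce signed 32-bit results.
  return rgba[0] + rgba[1] * 256 + rgba[2] * 65536 + rgba[3] * 16777216;
}

console.log(packUint16As2Channels(0x1234)); // [ 52, 18 ]
console.log(unpackUint32FromRGBA(packUint32AsRGBA(4294967295))); // 4294967295
```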

Grouping Properties by Size

Note: in what follows, when I say "group properties" I am not referring to group metadata from 3DTILES_metadata, but grouping properties together by size for space efficiency.

The next step is to group properties together into a single texel/vector to conserve space.

Note: this step is optional and should be controlled by a boolean flag. It's nice for memory efficiency, but will not be useful when interpolation is needed.

There are only 5 partitions of 4:

4
3 + 1
2 + 2
2 + 1 + 1
1 + 1 + 1 + 1

We can use this fact to pair up components to pack memory more densely:

  1. (Textures only) - partition properties into properties packed as FLOAT32 textures and properties packed as UINT8 textures. The following steps will apply to each type of texture separately
  2. Bin the properties by their number of components needed (1, 2, 3, or 4)
  3. Add the list of 4-component properties to the output list
  4. For each 3-component property, pair it with one of the 1-component properties (if available). Either way, add it to the output
  5. For each 2-component property, pair it with either another 2-component property, or up to 2 1-component properties (where possible). Either way, add it to the output
  6. Group the remaining 1-component properties in groups of 4 (as closely as possible) and add to the output.

For example, if I had (property, channels) = (A, 1), (B, 2), (C, 2), (D, 4), (E, 3), (F, 3), (G, 1), (H, 3), the algorithm would work like this:

After binning:
1: A, G
2: B, C
3: E, F, H
4: D

After handling 4-components:
output = [D]

After handling 3-components:
output = [D, [E, A], [F, G], H]

(note that there's nothing to pair with H so the texel will have an unused 4th component)

After handling 2-components:
output = [D, [E, A], [F, G], H, [B, C]]

After handling 1-components:
output = [D, [E, A], [F, G], H, [B, C]] (no changes needed)
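
The grouping steps can be sketched as a greedy pass over the bins. The function name `groupPropertiesBySize` and the `[name, channels]` input shape are assumptions for this illustration, as is the choice to prefer a 2 + 2 pairing over 2 + 1 + 1 when both are possible. The texture-only split by FLOAT32/UINT8 is left to the caller.

```javascript
// properties: array of [name, channelCount] pairs, channelCount in 1..4.
function groupPropertiesBySize(properties) {
  // Bin property names by the number of components they need.
  const bins = { 1: [], 2: [], 3: [], 4: [] };
  for (const [name, channels] of properties) {
    bins[channels].push(name);
  }

  const output = [];
  // A group of one is emitted as a bare name, matching the example above.
  const emit = (group) => output.push(group.length === 1 ? group[0] : group);

  // 4: these fill a texel/vec4 on their own.
  bins[4].forEach((name) => emit([name]));

  // 3 + 1: pair each 3-component property with a 1-component one if available.
  for (const name of bins[3]) {
    const group = [name];
    if (bins[1].length > 0) {
      group.push(bins[1].shift());
    }
    emit(group);
  }

  // 2 + 2, else 2 + 1 + 1 (with as many 1-component properties as remain).
  while (bins[2].length > 0) {
    const group = [bins[2].shift()];
    if (bins[2].length > 0) {
      group.push(bins[2].shift());
    } else {
      group.push(...bins[1].splice(0, 2));
    }
    emit(group);
  }

  // 1 + 1 + 1 + 1: remaining singles in groups of up to 4.
  while (bins[1].length > 0) {
    emit(bins[1].splice(0, 4));
  }
  return output;
}

const grouped = groupPropertiesBySize([
  ["A", 1], ["B", 2], ["C", 2], ["D", 4],
  ["E", 3], ["F", 3], ["G", 1], ["H", 3],
]);
console.log(JSON.stringify(grouped));
// prints ["D",["E","A"],["F","G"],"H",["B","C"]]
```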

Compute Layouts

For uniforms, each group of properties becomes a single uniform.

For attributes, each group of properties becomes a single attribute.

For textures, it's a little more involved. Each group of properties becomes a single texel, but there are a couple different ways these texels can be arranged:

  1. (my original idea) Each group of properties gets a number of rows of the texture, propertyHeight = ceil(featureCount / textureWidth), then texels are accessed by
row = propertyIndex * propertyHeight + floor(featureId / textureWidth)
column = featureId % textureWidth

where propertyIndex would be computed for each property.

  2. (@lilleyse's suggestion) Treat the properties as one big 1D array and wrap by the texture size:
index = propertyOffset + featureId
row = floor(index / textureWidth)
column = index % textureWidth

where propertyOffset is computed for each property.

I think Option 2 is nicer for its simplicity and better memory efficiency for multiple feature tables.
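
To make the difference concrete, here is a small sketch of both addressing schemes using the formulas above. The function names are hypothetical. With textureWidth = 10 and two property groups of 11 features each, Option 1 reserves 2 rows per group (4 rows in total), while Option 2 needs only 22 consecutive texels (3 rows).

```javascript
// Option 1: each property group owns a band of whole rows.
function texelCoordinatesByRows(propertyIndex, featureId, featureCount, textureWidth) {
  const propertyHeight = Math.ceil(featureCount / textureWidth);
  return {
    row: propertyIndex * propertyHeight + Math.floor(featureId / textureWidth),
    column: featureId % textureWidth,
  };
}

// Option 2: all properties form one 1D array wrapped by the texture width.
function texelCoordinatesByOffset(propertyOffset, featureId, textureWidth) {
  const index = propertyOffset + featureId;
  return {
    row: Math.floor(index / textureWidth),
    column: index % textureWidth,
  };
}

// Feature 3 of the second property group (11 features each, width 10):
console.log(texelCoordinatesByRows(1, 3, 11, 10)); // { row: 2, column: 3 }
console.log(texelCoordinatesByOffset(11, 3, 10)); // { row: 1, column: 4 }
```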

NOTE: In the above, assume textures are the maximum size and have 4 channels. Shrinking this layout to fit the content tightly is handled by the next step, at the end.

"Vacuum Packing"

To finish the layout, we want to avoid wasting memory, so reduce the dimensions to fit the data as tightly as possible. This involves:

  1. (Textures only), balance the texture dimensions to minimize unused texels. If there are N texels in use, this can be done with the formula:
rows = ceil(N / maximumTextureWidth)
columns = ceil(N / rows)

For example, say maximumTextureWidth = 10 and N = 11, we have:

Original texture use: 10 x 2, 9 pixels wasted:
1111111111
1000000000

rows = ceil(11/10) = 2
columns = ceil(11 / 2) = 6

Result: 6x2, only 1 texel wasted:
111111
111110

  2. (Textures) Not sure if this step is needed depending on the texture layout used, but ensure the height of the texture is exactly enough to fit all the used texels.
  3. Crop the number of channels if not all 4 are needed.

    • Texture example: Suppose there is a texture with a single property that requires 2 components (such as an ARRAY[UINT8, 2]), use a LUMINANCE_ALPHA texture rather than an RGBA texture

    • Attribute/uniform example: Suppose we only need 2 components (e.g. 2 UINT8 properties packed tightly into a single attribute), use a vec2, not a vec4
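
A minimal sketch of the texture part of this step is shown below: the dimension-balancing formula plus a channel-count lookup. `balanceTextureDimensions` and `FORMAT_BY_CHANNEL_COUNT` are hypothetical names. The format strings are the WebGL 1 pixel format names, and using LUMINANCE rather than ALPHA for a single channel is an assumption of this sketch.

```javascript
// Shrink a texture holding texelCount texels so few texels are wasted.
function balanceTextureDimensions(texelCount, maximumTextureWidth) {
  if (texelCount === 0) {
    return { rows: 0, columns: 0, wastedTexels: 0 };
  }
  const rows = Math.ceil(texelCount / maximumTextureWidth);
  const columns = Math.ceil(texelCount / rows);
  return { rows, columns, wastedTexels: rows * columns - texelCount };
}

// WebGL 1 pixel format names by number of channels actually used.
const FORMAT_BY_CHANNEL_COUNT = {
  1: "LUMINANCE",
  2: "LUMINANCE_ALPHA",
  3: "RGB",
  4: "RGBA",
};

console.log(balanceTextureDimensions(11, 10));
// { rows: 2, columns: 6, wastedTexels: 1 }
console.log(FORMAT_BY_CHANNEL_COUNT[2]); // LUMINANCE_ALPHA
```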

ptrgags commented 3 years ago

Oh one clarification: when it comes to grouping properties by size, this needs to be done per-type. So for example, when it comes to textures, the FLOAT32 properties are grouped together, while the UINT32 ones are handled separately.

sanjeetsuhag commented 3 years ago

Learned this while reviewing https://github.com/CesiumGS/cesium/pull/9595 - WebGL 1 does not have uint because it uses GLSL ES 1.00. WebGL 2 supports uint because it uses GLSL ES 3.00.

lilleyse commented 1 year ago

Requested in https://github.com/CesiumGS/cesium/issues/11450.

chen21439 commented 1 year ago

Requested in #11450.

Thank you for your reply. I am now trying to shake every building differently. I thought the metadata would be a good way to distinguish them. Can you give me some advice on how I can distinguish each building?

https://sandcastle.cesium.com/?src=Custom%20Shaders%203D%20Tiles.html&label=3D%20Tiles%20Next

In this example, a batch of buildings shares the same featureId.