CesiumGS / 3d-tiles

Specification for streaming massive heterogeneous 3D geospatial datasets

3D-Tiles 1.1 Implicit tile about .subtree binary file #765

Closed cjhdsg closed 4 months ago

cjhdsg commented 4 months ago

I am currently working on implementing implicit tiles in 3D-Tiles version 1.1 using C++, specifically focusing on the .subtree binary files. These files contain a binary chunk representing tile availability. I am seeking clarification on the data type (int, short, char, or others) used for storing this availability information.

Could you please provide insights into the data type utilized for storing tile availability in the binary section of .subtree files? Understanding this aspect is crucial for ensuring accurate interpretation and manipulation of the data.

Your guidance on this matter would be greatly appreciated.

javagl commented 4 months ago

The Implicit Tiling / Availability section in the specification describes the "entry point" for this data.

(Note: I just noticed a small error in the implementation note - see https://github.com/CesiumGS/3d-tiles/issues/766 - but this is unrelated to the broader question)

The crucial point in view of your question is given by the statement from the specification:

> Availability bitstreams are packed in binary using the format described in the Booleans section of the 3D Metadata Specification.

The linked section describes how to interpret the raw bytes of such an availability buffer view. Specifically: The availability is encoded bit-wise inside the raw buffer view data. Each byte of the buffer view data contains 8 bits, indicating the availability of up to 8 elements.

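A sketch in C++:

#include <cstddef>
#include <cstdint>
#include <vector>

// bufferViewData holds the raw bytes of the availability buffer view,
// which is a subsection of the subtree's binary buffer data.
bool isTileAvailable(const std::vector<std::uint8_t>& bufferViewData, std::size_t index) {
    std::size_t byteIndex = index / 8;  // the byte that contains the bit
    std::size_t bitIndex = index % 8;   // position in that byte, 0 = least significant bit
    return ((bufferViewData[byteIndex] >> bitIndex) & 1) == 1;
}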
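
The inverse direction, writing availability bits into bytes, can be sketched along the same lines. This is only an illustration of the packing rule described above (bit i lives in byte i / 8 at bit position i % 8, counted from the least significant bit); the function name `packAvailability` and the use of `std::vector<bool>` as input are assumptions made for the example, not part of the specification or of any Cesium library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs one availability flag per tile into bytes.
// Flag i is stored in byte i / 8 at bit position i % 8, where position 0
// is the least significant bit of that byte.
std::vector<std::uint8_t> packAvailability(const std::vector<bool>& available) {
    // Round up so that a partially filled last byte is still allocated.
    std::vector<std::uint8_t> bytes((available.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < available.size(); ++i) {
        if (available[i]) {
            bytes[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
        }
    }
    return bytes;
}
```

With this sketch, the eight flags 1 0 1 1 0 0 0 0 (for the tiles with index 0 to 7) end up in a single byte with the value 0x0D.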

For basic tests, you might consider loading a simple test data set like https://github.com/CesiumGS/3d-tiles-samples/tree/main/1.1/SparseImplicitQuadtree. It contains a subtreeInfo.md with very (very) detailed information about the data that is stored in the subtree, including the exact bits that are set inside the respective availability bitstreams, for example, at https://github.com/CesiumGS/3d-tiles-samples/blob/main/1.1/SparseImplicitQuadtree/screenshot/subtreeInfo.md#tile-availability

cjhdsg commented 4 months ago

I've just understood that each tile's availability is stored as a single bit rather than a byte, and that every 8 bits are packed into one byte that is stored in the .subtree file.

The crux of my confusion lies in how individual bits are organized and packed into bytes. I have attempted to perform numerical conversion, but it seems that the results do not align with the expected byte representation.

Could you please provide detailed insights into the process of converting bits to bytes for storing tile availability? Understanding this process is crucial for me to correctly interpret and manipulate the availability data.

Any guidance or clarification on this matter would be greatly appreciated.

Thank you for your assistance!

[Screenshot of the tile availability tables from subtreeInfo.md]

cjhdsg commented 4 months ago

I have recently observed an interesting correlation when reversing an 8-bit sequence, such as "10110000", which results in "00001101". Remarkably, the hexadecimal representation of the reversed sequence is "0x0d", which aligns with the expected byte representation. I am curious to know whether this is merely a coincidence or if there is a deeper underlying principle at play.

Could you please provide insights into whether there is a fundamental relationship or principle that explains this phenomenon?

Thank you for your assistance in exploring this intriguing observation.

GatorScott commented 4 months ago

Endianness.

javagl commented 4 months ago

The term endianness usually refers to the order of bytes within a multi-byte type. Here, the question is about the order of bits within a byte.

> I have recently observed an interesting correlation when reversing an 8-bit sequence, such as "10110000", which results in "00001101". Remarkably, the hexadecimal representation of the reversed sequence is "0x0d", which aligns with the expected byte representation. I am curious to know whether this is merely a coincidence or if there is a deeper underlying principle at play.

For the tables that are shown in that subtreeInfo.md (and that you showed in the screenshot), I hesitated for quite a while over which bit order to use. I was afraid that either bit order could be confusing, depending on the understanding or expectation that a reader might have. So I tried to resolve the ambiguity by explicitly saying "Bits [0...7]:" in the description.

Usually, when you show the bits of one byte, the order will be 7 6 5 4 3 2 1 0. In a byte with value 6, the bits 2 and 1 will be set, meaning that it will usually be displayed as 0 0 0 0 0 1 1 0.

In the tables, the bits are shown with the order 0 1 2 3 4 5 6 7 (as indicated by the [0...7] part), meaning that the bits for a value of 6 will be 0 1 1 0 0 0 0 0.
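
The two display orders can be compared with a small sketch. Both functions print the same byte; only the direction in which the bit positions are visited differs. The names `bitsMsbFirst` and `bitsLsbFirst` are made up for this illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: the usual display order 7 6 5 4 3 2 1 0
// (most significant bit on the left).
std::string bitsMsbFirst(std::uint8_t value) {
    std::string s;
    for (int bit = 7; bit >= 0; --bit) {
        s += ((value >> bit) & 1) ? '1' : '0';
    }
    return s;
}

// Hypothetical helper: the order 0 1 2 3 4 5 6 7 that is used in the
// tables (bit with index 0 on the left).
std::string bitsLsbFirst(std::uint8_t value) {
    std::string s;
    for (int bit = 0; bit < 8; ++bit) {
        s += ((value >> bit) & 1) ? '1' : '0';
    }
    return s;
}
```

For the value 6, `bitsMsbFirst` yields 00000110 and `bitsLsbFirst` yields 01100000. One string is always the reverse of the other, which is why reversing a table row gives the conventional binary notation of the stored byte.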

Now you may ask why I did this. The answer is that I tried to make these table rows more similar to the images in the specification (https://github.com/CesiumGS/3d-tiles/tree/main/specification/ImplicitTiling#tile-availability), where the "Tile Availability" bitstream is shown from left to right (i.e., the bit with index 0 is shown at the leftmost side).

When you take the bits from the tables in the order in which they are shown, they will be 10110000 01001100 10000000, matching the order of the bits in these images.
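
To go from such a left-to-right bit listing back to the stored bytes, each character can be treated as the bit with the next index. The following is a minimal sketch under that assumption; `bytesFromBitString` is a hypothetical helper for experimenting with the table rows, not something provided by the specification or the sample repository.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: converts a bitstream written in index order
// (bit 0 first, for example "10110000 01001100") into packed bytes.
// Characters other than '0' and '1', such as spaces, are skipped.
std::vector<std::uint8_t> bytesFromBitString(const std::string& bits) {
    std::vector<std::uint8_t> bytes;
    std::size_t index = 0;  // index of the next bit in the stream
    for (char c : bits) {
        if (c != '0' && c != '1') continue;
        if (index % 8 == 0) bytes.push_back(0);  // start a new byte
        if (c == '1') {
            bytes.back() |= static_cast<std::uint8_t>(1u << (index % 8));
        }
        ++index;
    }
    return bytes;
}
```

Applied to the string "10110000 01001100 10000000", this sketch produces the bytes 0x0D, 0x32 and 0x01. The first of them is the 0x0d that was observed earlier in this thread.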

javagl commented 4 months ago

The "[0...7]" part of the description was intended to make clear which bit order is used for the description. If this is confusing, then it could be changed, but ... I wonder how it should be changed. So unless there are specific requests for changes, I assume that there are no actionable items here.