CesiumGS / 3d-tiles-tools


Storage increase issue caused by b3dm to glb conversion #123

Closed HerculesJL closed 6 months ago

HerculesJL commented 7 months ago

Hello! I use the following command to convert B3DM to GLB, and the file size increases from 1475 KB to 2081 KB. Is this normal? If not, how can it be resolved?

 npx 3d-tiles-tools convertB3dmToGlb -i .\Node_3941_-1.b3dm -o ./Node.glb

Node3941-1.zip

javagl commented 7 months ago

This should usually not be the case. There are some corner cases (omitting some details): a B3DM may contain data in formats where converting it to standard glTF 2.0 can cause an increase in size, and there are open questions about how certain forms of compression from the legacy tile formats can be "mapped" to glTF.

But here, it is definitely a bug. I had a look at the GLB files, the one that is extracted from the B3DM (with b3dmToGlb) and the one that is converted to GLB (with convertB3dmToGlb), and ... it took a moment to spot this, but the problem is shown here:

[Image: Cesium Tools issue 123]

The GLB uses the _BATCHID attribute that refers to the B3DM data. The accessor that is used there is always the same (accessor 14 in the image). But during the upgradeB3dmToGlb, the tools convert each of these into a new accessor for the _FEATURE_ID_0 attribute: in the first primitive this is accessor 11, in the second primitive accessor 17, and so on. This means that the former _BATCHID data is duplicated unnecessarily.
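As a side note, this kind of duplication can be made visible with a few lines of glTF-Transform. The following is only a rough sketch, not part of the tools: it assumes a clone of this repository (as described in the Developer Setup) so that the GltfTransform class can be imported, and the file path is a placeholder. It prints which accessor each primitive attribute refers to, so that per-primitive duplicates of _FEATURE_ID_0 stand out.

import fs from "fs";
import { GltfTransform } from "./src/tools/contentProcessing/GltfTransform";

// Rough sketch (not part of the tools): print which accessor each primitive
// attribute refers to, to make duplicated _FEATURE_ID_0 accessors visible.
async function listAttributeAccessors(inputPath: string) {
  const io = await GltfTransform.getIO();
  const document = await io.readBinary(fs.readFileSync(inputPath));
  const root = document.getRoot();
  const accessors = root.listAccessors();
  root.listMeshes().forEach((mesh, meshIndex) => {
    mesh.listPrimitives().forEach((primitive, primitiveIndex) => {
      for (const semantic of primitive.listSemantics()) {
        const accessor = primitive.getAttribute(semantic);
        const accessorIndex = accessor ? accessors.indexOf(accessor) : -1;
        console.log(
          `mesh ${meshIndex}, primitive ${primitiveIndex}: ` +
            `${semantic} -> accessor ${accessorIndex}`
        );
      }
    });
  });
}

// Placeholder path; adjust as necessary
listAttributeAccessors("C:/Data/input.glb");

For the converted file in question, this shows a different _FEATURE_ID_0 accessor for each primitive, even though they all carry the same values.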


There could be some really simple "sledgehammer" solutions for that: I could just call the glTF-Transform dedup()/prune() functions after the migration to get rid of the duplicated data. But I'd lean towards trying to make the migration process itself "cleaner" at this point, and ensure that the conversion from _BATCHID to _FEATURE_ID_0 is only done once and the data that is then obsolete is properly disposed.


If this is "urgent" and you need a quick solution:

You could clone the repository as described in the Developer Setup, put this file as FixDuplication.ts into the root directory of the project...:

import fs from "fs";
import { GltfTransform } from "./src/tools/contentProcessing/GltfTransform";
import { dedup, prune } from "@gltf-transform/functions";

async function run() {
  // Adjust these paths as necessary
  const dir = "C:/Data/";
  const input = dir + "input.glb";
  const output = dir + "output-fixed.glb";

  // Use the IO instance that the tools use internally, so that the
  // relevant glTF extensions are registered
  const io = await GltfTransform.getIO();
  const inputGlb = fs.readFileSync(input);
  const document = await io.readBinary(inputGlb);

  // Remove the duplicated accessors and prune the data that becomes unused
  await document.transform(dedup(), prune());

  const outputGlb = await io.writeBinary(document);
  fs.writeFileSync(output, outputGlb);
}

run();

(adjusting the path as necessary), and then run this with

npx ts-node FixDuplication.ts

This should deduplicate the data in the given file, resulting in a file of roughly 1.48 MB.


This is not directly applicable to multiple files (or to a whole tileset, as part of the upgrade command). But solving the underlying issue should not be too difficult, and I'll try to allocate some time for that and create a PR to fix this soon.
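If multiple GLB files have to be fixed in the meantime, the same dedup()/prune() call could be applied in a loop. This is only a sketch under the same assumptions as the script above; the directory name is a placeholder, and it overwrites the files in place:

import fs from "fs";
import path from "path";
import { GltfTransform } from "./src/tools/contentProcessing/GltfTransform";
import { dedup, prune } from "@gltf-transform/functions";

// Rough sketch: apply the same dedup/prune fix to every GLB file in a
// directory, overwriting the files in place. The directory is a placeholder.
async function fixAll(directory: string) {
  const io = await GltfTransform.getIO();
  for (const fileName of fs.readdirSync(directory)) {
    if (!fileName.toLowerCase().endsWith(".glb")) {
      continue;
    }
    const fullPath = path.join(directory, fileName);
    const document = await io.readBinary(fs.readFileSync(fullPath));
    await document.transform(dedup(), prune());
    fs.writeFileSync(fullPath, await io.writeBinary(document));
  }
}

fixAll("C:/Data/");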

HerculesJL commented 7 months ago

Thank you so much, I really appreciate your help.

javagl commented 7 months ago

I'll leave this open until the underlying issue is fixed, just to keep track of it.

javagl commented 6 months ago

While fixing this, I noticed another issue: some very old legacy B3DM data can contain a GLB where the batch ID attribute is not called "_BATCHID" but "BATCHID" (without the underscore). The attribute with the legacy name was already recognized as an input and properly converted into a feature ID, but it was not removed afterwards, i.e. the resulting primitive then contained both a _FEATURE_ID_0 and a BATCHID attribute.
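For data that was already converted with the old behavior, the leftover attribute could be removed manually. This is only a rough sketch, again using the GltfTransform class of the tools and placeholder paths:

import fs from "fs";
import { GltfTransform } from "./src/tools/contentProcessing/GltfTransform";
import { prune } from "@gltf-transform/functions";

// Rough sketch: remove a leftover legacy "BATCHID" attribute from all
// primitives of an already-converted GLB, and prune the then-unused accessors.
async function removeLegacyBatchId(input: string, output: string) {
  const io = await GltfTransform.getIO();
  const document = await io.readBinary(fs.readFileSync(input));
  for (const mesh of document.getRoot().listMeshes()) {
    for (const primitive of mesh.listPrimitives()) {
      // Setting an attribute to null removes it from the primitive
      primitive.setAttribute("BATCHID", null);
    }
  }
  await document.transform(prune());
  fs.writeFileSync(output, await io.writeBinary(document));
}

// Placeholder paths; adjust as necessary
removeLegacyBatchId("C:/Data/input.glb", "C:/Data/output-fixed.glb");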