CesiumGS / 3d-tiles

Specification for streaming massive heterogeneous 3D geospatial datasets :earth_americas:

Decouple spatial subdivision and metadata from 3D Tiles #519

Closed pjcozzi closed 2 years ago

pjcozzi commented 3 years ago

This is a roadmap item for an idea we have discussed but I don't think we have written down.

3D Tiles is basically:

3D Tiles Next is basically

As we look to metaverse use cases such as

3D Tiles has the opportunity to broadly solve spatial subdivision and metadata interoperability, e.g.,

  1. Spatial subdivision with USD as the tile payloads
  2. Spatial subdivision within a glTF file with glTF nodes as the payload
  3. glTF asset with just the metadata extensions to describe a car that can be imported into multiple metaverses
  4. USD asset using a metadata semantic extension and the core metadata type system, but a USD-appropriate encoding

I believe some of these use cases are already possible (3 for sure) and that others are just about there so this could be more about positioning, documenting, and inspiring than any major changes. Just asking that we have the decoupling thought through and the big metaverse ecosystem interoperability vision in mind.

CC @donmccurdy @lilleyse @ptrgags

ptrgags commented 3 years ago

  1. We haven't talked about USD before, but there's nothing stopping one from defining a 3DTILES_content_USD extension someday (perhaps with a better name; to a newcomer that might sound like you're storing money in a tileset...).
  2. That's a big question: "should 3D Tiles 2.0 be a single binary file that's more glTF like?... should 3D Tiles 2.0 be a glTF + extensions?"
  3. That can be supported with EXT_mesh_features today; just create a car class. For interoperability, (A) the game industry would have to define what set of standard semantic values (e.g. CAR_PAINT_COLOR, CAR_MANUFACTURER, CAR_FUEL_EFFICIENCY) they want to use, and (B) it would probably make sense to make an external schema JSON for the class definition so it can be shared.
  4. I've never worked with USD files before, so I'm not sure of the technical details here. One could define a USD extension to add metadata using concepts from the Cesium 3D Metadata specification. This would be similar to how EXT_mesh_features brings metadata capabilities to glTF.
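To make point 3 a little more concrete, here is a minimal sketch of what a shareable "car" class definition could look like. The layout (classes, properties, type, semantic) is only loosely modeled on the 3D Metadata schema and is an assumption; the schema id, the property names, and both helper functions are hypothetical, and the semantic names are the illustrative ones from the comment above, not an agreed standard.

```python
import json

# Hypothetical class definition for a "car". Field names are loosely modeled
# on the 3D Metadata schema; check the specification for the exact layout.
CAR_SCHEMA = {
    "id": "example-car-schema",  # assumed identifier, not from any spec
    "classes": {
        "car": {
            "properties": {
                "paintColor": {"type": "STRING", "semantic": "CAR_PAINT_COLOR"},
                "manufacturer": {"type": "STRING", "semantic": "CAR_MANUFACTURER"},
                "fuelEfficiency": {"type": "FLOAT32", "semantic": "CAR_FUEL_EFFICIENCY"},
            }
        }
    },
}


def semantics_of(schema: dict, class_name: str) -> dict:
    """Map each semantic name to the id of the property that carries it."""
    props = schema["classes"][class_name]["properties"]
    return {p["semantic"]: name for name, p in props.items() if "semantic" in p}


def write_external_schema(schema: dict, path: str) -> None:
    """Write the schema to its own JSON file so several assets can share it."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(schema, f, indent=2)
```

An importing engine would then look up values by semantic rather than by property name, which is what lets two tools agree on meaning without agreeing on naming.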

Does USD make sense in a 3D Tiles context? I think this needs more context and understanding of the USD format on my part. Is the goal still to stream tiles over a network, or is it meant to be packaged with the assets for a game client? Is USD optimized for efficient transmission? What about runtime use?

pjcozzi commented 3 years ago

@ptrgags

Also I should have explicitly mentioned another metaverse / AR use case:

donmccurdy commented 3 years ago

One comment on (3) and "glTF asset with just the metadata extensions" —

EXT_mesh_features is geared toward "features", loosely defined as space-occupying subcomponents within an optimized mesh or texture. This will be very useful for metadata about details in complex scenes. When talking about metadata that describes the entire asset, or non-spatial concepts within it (materials, audio, etc.) we might be more in the territory of KHR_xmp_json_ld metadata. But either way, this can be represented.

pjcozzi commented 3 years ago

> EXT_mesh_features is geared toward "features", loosely defined as space-occupying subcomponents within an optimized mesh or texture. This will be very useful for metadata about details in complex scenes. When talking about metadata that describes the entire asset, or non-spatial concepts within it (materials, audio, etc.) we might be more in the territory of KHR_xmp_json_ld metadata. But either way, this can be represented.

@donmccurdy roger that.

Let me back up on metadata in general.

I believe metadata in 3D Tiles Next is architected roughly with the following separation of concerns, from bottom of the stack to the top:

  1. Type system (Cesium 3D Metadata Specification)
  2. Encoding (JSON or binary) and granularity (vertex or texel or glTF object property / node or 3D Tiles tile/etc) (EXT_mesh_features and 3DTILES_metadata)
  3. Extensions that define dictionaries with actual semantics (e.g., Cesium Metadata Semantic Reference)

I think this is close to how we would want to separate concerns to allow an ecosystem of semantic metadata extensions, e.g., how to import a photogrammetry city model into different engines that includes the material types of the exterior walls, or importing a car model into different engines and knowing the coefficient of friction of the tires, but that (2) may be too coupled. Seems like there should exist logically (but perhaps not physically) this stack:

  1. Type system
  2. Encoding in JSON and binary
  3. Encoding at different granularities
  4. Encoding into different containers, e.g., glTF, 3D Tiles, or USD
  5. Semantic dictionaries

I think the raw concepts are all here but I am not sure that the current specs are decoupled in the right way to allow ecosystems of (4) and (5) to develop. I don't know that we need to do a big change at the moment, but this is something we should explore as we get community feedback on 3D Tiles Next and strive to architect something that can be a 10 year or more foundation for the metaverse.
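As a thought experiment, the logical stack above can be sketched in a few lines. Everything here is hypothetical (the PropertyType class, the two example types, the attach helper, and the TIRE_FRICTION semantic are invented for illustration); the only point is that the encoded payload does not change when the container label does.

```python
import json
import struct
from dataclasses import dataclass


# Layer 1: type system (hypothetical, far smaller than a real one).
@dataclass(frozen=True)
class PropertyType:
    name: str
    struct_code: str  # format character used by the binary encoding


FLOAT32 = PropertyType("FLOAT32", "f")
UINT8 = PropertyType("UINT8", "B")


# Layer 2: encodings know about types, but not about containers.
def encode_json(values):
    return json.dumps(list(values))


def encode_binary(ptype, values):
    values = list(values)
    return struct.pack(f"<{len(values)}{ptype.struct_code}", *values)


def decode_binary(ptype, data):
    count = len(data) // struct.calcsize("<" + ptype.struct_code)
    return list(struct.unpack(f"<{count}{ptype.struct_code}", data))


# Layers 3 and 4 (granularity, container) are plain labels in this sketch;
# layer 5 is the semantic name taken from some shared dictionary.
@dataclass
class EncodedProperty:
    semantic: str
    ptype: PropertyType
    granularity: str  # e.g. "vertex", "texel", "node", "tile"
    container: str    # e.g. "glTF", "3D Tiles", "USD"
    payload: bytes


def attach(semantic, ptype, values, granularity, container):
    """Encode once, then label the result with where it will live."""
    payload = encode_binary(ptype, values)
    return EncodedProperty(semantic, ptype, granularity, container, payload)
```

If layers 2 through 4 were too tightly coupled, switching the container argument from "glTF" to "USD" would force a different encoder; in this sketch it only changes a label.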

As for XMP, we should leverage it, interoperate with it, reference it normatively, and message when to use each, as appropriate given XMP's strengths, use cases, and current ecosystem. Very open to recommendations and avoiding duplicate work.

donmccurdy commented 2 years ago

Thinking about (4) and (5) — at least as a thought exercise, it's helpful to compare the Semantic Reference to the XMP Dublin Core Namespace, or perhaps other namespaces. Can we cleanly map XMP namespaces in general to 3D Metadata semantics and encodings? Some of XMP's higher-level types, like Language Alternatives, are not so trivial for encoding. Below is a dc:title property value, analogous to NAME in 3D Metadata, except that it includes translations:

<xmp:Title>
    <rdf:Alt>
        <rdf:li xml:lang="x-default">XMP - Extensible Metadata Platform</rdf:li>
        <rdf:li xml:lang="en-us">XMP - Extensible Metadata Platform</rdf:li>
        <rdf:li xml:lang="fr-fr">XMP - Une Platforme Extensible pour les Métadonnées</rdf:li>
        <rdf:li xml:lang="it-it">XMP - Piattaforma Estendibile di Metadata</rdf:li>
    </rdf:Alt>
</xmp:Title>

I'm not sure what the JSON-LD encoding of that would be, but here's a rough guess:

{
    "dc:title": [
        {"lang": "x-default", "value": "XMP - Extensible Metadata Platform"},
        {"lang": "en-us", "value": "XMP - Extensible Metadata Platform"},
        {"lang": "fr-fr", "value": "XMP - Une Platforme Extensible pour les Métadonnées"},
        {"lang": "it-it", "value": "XMP - Piattaforma Estendibile di Metadata"}
    ]
}
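Assuming the rough shape guessed above (a list of lang/value pairs, which is not a confirmed JSON-LD encoding), a consumer would still need some rule for choosing one alternative. The following is a minimal sketch of such a rule; the function name and the fallback order are assumptions, not something XMP or 3D Metadata prescribes.

```python
TITLE = [
    {"lang": "x-default", "value": "XMP - Extensible Metadata Platform"},
    {"lang": "en-us", "value": "XMP - Extensible Metadata Platform"},
    {"lang": "fr-fr", "value": "XMP - Une Platforme Extensible pour les Métadonnées"},
    {"lang": "it-it", "value": "XMP - Piattaforma Estendibile di Metadata"},
]


def resolve_language(alternatives, preferred):
    """Pick a value for `preferred`: exact tag first, then any entry sharing
    the primary language subtag, then "x-default", then the first entry."""
    by_lang = {alt["lang"].lower(): alt["value"] for alt in alternatives}
    wanted = preferred.lower()
    if wanted in by_lang:
        return by_lang[wanted]
    primary = wanted.split("-")[0]
    for lang, value in by_lang.items():
        if lang != "x-default" and lang.split("-")[0] == primary:
            return value
    if "x-default" in by_lang:
        return by_lang["x-default"]
    return alternatives[0]["value"] if alternatives else None
```

The lookup itself is simple; the harder part is that a value like this is a small nested structure rather than a single scalar or string.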

XMP doesn't provide a binary encoding, which feels like a big limitation for 3D Tiles and glTF. I could imagine defining something based on CBOR Typed Arrays, Protocol Buffers, FlatBuffers, or similar efforts. These are mature enough, at least as libraries, but I'm less sure whether they're appropriate for use in standards. And none (with the possible exception of CBOR?) store metadata in a format appropriate for GPU vertex attribute upload.

EDIT: Perhaps a better encoding comparison would be Apache Arrow; the loaders.gl team has a helpful introduction from a web perspective: https://loaders.gl/arrowjs/docs. Pros: Balances support for the nested data types XMP would require, while maintaining SIMD- and GPU-friendly, columnar data layout. Cons: More complex than the binary encoding defined by the 3D Metadata specification, e.g. the apache-arrow JS library is 50kb minzipped.
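For readers unfamiliar with columnar layouts, here is a minimal sketch of the general idea for variable-length strings: one contiguous buffer of UTF-8 bytes plus a buffer of n+1 offsets. It uses only the Python standard library, not the Arrow libraries, and the function names are hypothetical. It is similar in spirit to how columnar formats store strings, but it is not a byte-exact implementation of Arrow or of the 3D Metadata binary encoding.

```python
import struct


def encode_strings(strings):
    """Return (offsets, data): n+1 little-endian uint32 offsets and UTF-8 bytes."""
    encoded = [s.encode("utf-8") for s in strings]
    offsets = [0]
    for chunk in encoded:
        offsets.append(offsets[-1] + len(chunk))
    return struct.pack(f"<{len(offsets)}I", *offsets), b"".join(encoded)


def decode_strings(offsets_buf, data):
    """Inverse of encode_strings: slice the data buffer at consecutive offsets."""
    count = len(offsets_buf) // 4
    offsets = struct.unpack(f"<{count}I", offsets_buf)
    return [data[a:b].decode("utf-8") for a, b in zip(offsets, offsets[1:])]
```

Both buffers are flat and fixed-stride, so they can be memory-mapped or uploaded without parsing; nested values such as language alternatives would need a further level of offsets on top, which is where the extra complexity comes from.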

pjcozzi commented 2 years ago

I think this issue is now duplicate with:

Solving the latter two (or just one) should also answer how 3D Tiles implicit/explicit tiling can be decoupled from 3D Tiles in general.

@ptrgags @donmccurdy please comment in those issues if there is anything of substance here that would be better served in any of them. 🙏