immersive-web / model-element

Repository for the <model> tag. Feature leads: Marcos Cáceres and Laszlo Gombos
https://immersive-web.github.io/model-element/

Is media= enough for LOD handling? (or do we need srcset?) #49

Open mikkoh opened 1 year ago

mikkoh commented 1 year ago

One thing I've been considering is how LOD (level of detail) should be handled on the web. Could we leverage <source srcset> to define LOD swapping?

An example:

<model>
    <source 
      srcset="assets/example_200.usdz 200w, assets/example_1024.usdz 1024w"
      type="model/vnd.usdz+zip"
    >
</model>

The idea here is that when the model is rendered at 200px or less in the viewport, the model at assets/example_200.usdz would be used. Just to be clear, I'm not talking about the element's innerWidth but rather the render resolution of the model itself, calculated from a bounding sphere.

It could be assumed that the lowest quality model is the one that would be rendered at the lowest resolution.
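
A minimal sketch of that selection rule, assuming a srcset that uses only width descriptors and a simple perspective camera. The names parseSrcset, pickCandidate and projectedDiameterPx are hypothetical and not part of any <model> proposal; the projection is a small-angle approximation of the bounding sphere's on-screen size.

```typescript
// Hypothetical types and helpers for illustration only.
interface Candidate {
  url: string;
  width: number; // numeric value of the "w" descriptor
}

// Parse entries such as "a.usdz 200w, b.usdz 1024w"; entries without a
// width descriptor are skipped. The result is sorted by ascending width.
function parseSrcset(srcset: string): Candidate[] {
  const candidates: Candidate[] = [];
  for (const entry of srcset.split(",")) {
    const parts = entry.trim().split(/\s+/);
    const url = parts[0];
    const descriptor = parts[1];
    if (parts.length !== 2 || !url || !descriptor || !/^\d+w$/.test(descriptor)) {
      continue;
    }
    candidates.push({ url, width: parseInt(descriptor, 10) });
  }
  return candidates.sort((a, b) => a.width - b.width);
}

// Pick the smallest candidate that covers the rendered size; fall back to
// the largest candidate when the model is rendered bigger than all of them.
function pickCandidate(candidates: Candidate[], renderedPx: number): Candidate | undefined {
  const sorted = [...candidates].sort((a, b) => a.width - b.width);
  return sorted.find((c) => c.width >= renderedPx) ?? sorted[sorted.length - 1];
}

// Approximate on-screen diameter (in pixels) of a bounding sphere seen by a
// perspective camera with vertical field of view fovY (radians).
function projectedDiameterPx(
  radius: number,
  distance: number,
  fovY: number,
  viewportHeightPx: number
): number {
  const visibleHeight = 2 * distance * Math.tan(fovY / 2);
  return ((2 * radius) / visibleHeight) * viewportHeightPx;
}
```

With the srcset from the example above, a sphere that projects to about 100px would resolve to assets/example_200.usdz, and anything above 200px would resolve to assets/example_1024.usdz.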

An argument against this idea is that the 3D model format itself should define this functionality.

I've created a demo using canvas/webgl that has similar functionality that I can share at some point if it's helpful.

marcoscaceres commented 1 year ago

An argument against this idea is that the 3D model format itself should define this functionality.

potentially, yes... do they?

I've created a demo using canvas/webgl that has similar functionality that I can share at some point if it's helpful.

That would be great!

Another thing to consider is whether the <source media=""> attribute would be sufficient, and whether we really want to reuse the srcset machinery.

mikkoh commented 1 year ago

@marcoscaceres glTF has at least one extension, MSFT_lod. Pixar does do LOD handling in USD, but nothing has shipped in USDZ AFAIK. @donmccurdy have you seen any other glTF extensions for LOD handling?

donmccurdy commented 1 year ago

The proposal for KHR_mesh_variants would get us pretty close to LOD handling, too. That said — LODs are a very specific technical solution to a broader problem. Whether file formats adopt them or not, a <model> element may need a more general approach to adaptive loading and rendering.

Consider glTF extensions for compression — e.g. Draco, Meshopt, Basis — that either complement or exceed traditional HTTP transport compression like gzip and brotli for 3D content. Compression technology will improve over time, and browser support for existing and future extensions will vary. See https://github.com/WebKit/explainers/issues/75 for more discussion here. glTF headers identify when a particular extension is "optional" or "required" when loading an asset, and the user agent should ideally download compressed 3D scenes if possible.
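
As a rough illustration of the "optional" versus "required" distinction: a glTF asset lists extensions in the top-level extensionsUsed and extensionsRequired arrays. The sketch below assumes the JSON header has already been parsed; the function names and the supported set are hypothetical, not an API of any browser or loader.

```typescript
// Subset of the glTF top-level JSON that matters for this check.
interface GltfHeader {
  extensionsUsed?: string[];
  extensionsRequired?: string[];
}

// An asset can be loaded only if every required extension is supported.
function canLoad(header: GltfHeader, supported: ReadonlySet<string>): boolean {
  return (header.extensionsRequired ?? []).every((name) => supported.has(name));
}

// Extensions that are used but not required can be ignored by a loader that
// does not understand them; this lists the ones that would be ignored.
function ignoredOptionalExtensions(
  header: GltfHeader,
  supported: ReadonlySet<string>
): string[] {
  const required = new Set(header.extensionsRequired ?? []);
  return (header.extensionsUsed ?? []).filter(
    (name) => !required.has(name) && !supported.has(name)
  );
}
```

A user agent could run a check like this against each candidate asset and skip those whose required extensions (for example KHR_draco_mesh_compression) it does not implement, although it would need to fetch at least the header to do so.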

While I don't know whether USD has a similar concept of extensions, I imagine similar complications might apply to running different versions of the USD Runtime in different browsers, or to different subsets of the broader USD format being supported in different browsers.

tl;dr: there are a lot of criteria affecting which asset a browser should load. If srcset is the right way to incorporate viewport size into those criteria then I don't see any problem there, but I suspect <source media="" ...> may end up doing more of the heavy lifting with these other criteria.

marcoscaceres commented 1 year ago

I suspect <source media="" ...> may end up doing more of the heavy lifting with these other criteria.

I agree. At least, I'd like to exhaust media="" and possible media queries before considering srcset.

marcoscaceres commented 1 year ago

Just to be clear, LOD handling might still be done dynamically by the format itself. This would be more for model selection based on, for example, prefers-reduced-data, or selecting a model to better suit the device's pixel density, or some other media feature.
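
A sketch of what media-based selection could look like, assuming <model> would follow the same first-match rule that <picture> and <video> use for their <source> children. The ModelSource shape and pickSource function are hypothetical; the media matcher is passed in so the logic stays independent of the environment.

```typescript
// Hypothetical in-memory form of a <source> child of <model>.
interface ModelSource {
  src: string;
  media?: string; // media query; absent or empty means "always matches"
  type?: string;
}

// Return the first source whose media query matches, in document order.
function pickSource(
  sources: ModelSource[],
  matchesMedia: (query: string) => boolean
): ModelSource | undefined {
  return sources.find(
    (s) => s.media === undefined || s.media.trim() === "" || matchesMedia(s.media)
  );
}

// In a browser the matcher could be:
//   (query) => window.matchMedia(query).matches
// with sources such as
//   { src: "small.usdz", media: "(prefers-reduced-data: reduce)" }
//   { src: "dense.usdz", media: "(min-resolution: 2dppx)" }
//   { src: "default.usdz" }
```

The file names in the comment are placeholders. Note that prefers-reduced-data is not yet widely implemented, so a source without a media attribute is still needed as a fallback.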

cabanier commented 1 year ago

Just to be clear, LOD handling might still be done dynamically by the format itself. This would be more for model selection based on, for example, prefers-reduced-data, or selecting a model to better suit the device's pixel density, or some other media feature.

LOD should be left up to the UA. We can't make it observable how far the model is from the user because that would leak too much information. The UA should also be free to disable model rendering if the user walks away from the model. I looked at the MSFT_lod extension and it's unclear if that is the right approach. For instance, can animations be enabled/disabled based on it?

donmccurdy commented 1 year ago

MSFT_lod is certainly not a holistic approach to adjusting a model for every target device. It is a more specific optimization technique generally used for swapping particular objects within the scene for higher-detail versions as the camera gets closer to those objects, often used alongside other responsive performance techniques in games.
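
For readers unfamiliar with the technique, here is a generic sketch of per-object LOD swapping by camera distance. It does not reflect how MSFT_lod encodes its data; the LodLevel shape, the pickLod function and the distance thresholds are all assumptions made for illustration.

```typescript
// Hypothetical description of one detail level of a single object.
interface LodLevel {
  name: string;
  maxDistance: number; // use this level while the camera is at most this far away
}

// Pick the most detailed level whose distance range still covers the camera;
// beyond every range, fall back to the coarsest (farthest) level.
function pickLod(levels: LodLevel[], cameraDistance: number): LodLevel | undefined {
  const sorted = [...levels].sort((a, b) => a.maxDistance - b.maxDistance);
  return sorted.find((l) => cameraDistance <= l.maxDistance) ?? sorted[sorted.length - 1];
}
```

A renderer would typically re-evaluate this per object each frame, which is why it sits inside the engine rather than in markup.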

Per https://github.com/immersive-web/model-element/issues/49#issuecomment-1211322753, there are many reasons a model might be "costly". It may be a large file requiring too much bandwidth, or have an expensive material requiring too much GPU compute, or contain large textures requiring too much GPU VRAM, or have too many draw calls for the available CPU and graphics APIs. I do not think it is likely that fallbacks for all scenarios can/will be expressed and included within a single file, in any format. We probably need to allow the developer to provide options, and to express the differences between them somehow.

donmccurdy commented 1 year ago