Support building graphs from `MLTensor` containing constants

bbernhar commented 1 month ago

Demonstrate how MLTensor can be used to help web developers manage constant data (e.g., trained weights) on-device.

Dependent PRs

MLConstantOperand: https://github.com/webmachinelearning/webnn/issues/668#issue-2272156183
MLTensor: https://github.com/webmachinelearning/webnn/pull/754

Motivation

Allow constant data to be uploaded directly to the device, which is a capability that Execution Providers (EPs) leverage to prevent out-of-memory (OOM) errors (ORT example).
Re-use constant buffers in system memory between graphs, particularly for encoder-decoder models like Whisper.

Design

An MLTensor containing constant data is associated with an MLOperand by name when the operand is created. At build(), the (unoptimized) constant data is copied to the device. The original constant data (i.e., the ArrayBuffer input or the uploaded device data held by the MLTensor) can be discarded as soon as writeBuffer() has been called and build() succeeds.

Example JS

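// Upload constant data directly to the device. The resulting tensor is immutable.
// createTensor() returns a promise (see Proposed IDL), so it must be awaited.
const constant_tensor = await ctx.createTensor({usage: MLTensorUsage.GRAPH_CONSTANT, ...}, new Uint8Array(...), ...);

const builder = new MLGraphBuilder(ctx);
// Declare the constant as a named input that mirrors the tensor's descriptor.
const constant = builder.input('myconstant', {dataType: constant_tensor.dataType, shape: constant_tensor.shape});
...
// Bind the tensor to the input by name at build time.
const graph = await builder.build(outputs, {'myconstant': constant_tensor});

// Optional: free up system memory
constant_tensor.destroy();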
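
To make the lifecycle described under Design easier to reason about, here is a small mock that runs without a WebNN implementation. It is only a sketch: `MockTensor` and `MockGraphBuilder` are hypothetical stand-ins invented for illustration, not WebNN interfaces. The mock `build()` is synchronous and takes only the named constants, whereas the proposed `build()` returns a promise and also takes the outputs. The mock assumes that build() copies the bound data, so the source tensor can be destroyed afterwards, and that one tensor can be bound to more than one graph.

```javascript
// Mock model of the proposed lifecycle. None of these classes are WebNN APIs.
class MockTensor {
  constructor(descriptor, sourceData) {
    this.dataType = descriptor.dataType;
    this.shape = descriptor.shape;
    // Simulates the upload: the tensor keeps its own copy of the bytes.
    this.data = sourceData.slice();
    this.destroyed = false;
  }

  destroy() {
    this.data = null;
    this.destroyed = true;
  }
}

class MockGraphBuilder {
  constructor() {
    this.inputs = new Map(); // input name -> descriptor
  }

  input(name, descriptor) {
    this.inputs.set(name, descriptor);
    return { name, ...descriptor };
  }

  // Binds tensors to inputs by name and copies their data into the "graph".
  build(constants = {}) {
    const bound = new Map();
    for (const [name, tensor] of Object.entries(constants)) {
      const desc = this.inputs.get(name);
      if (!desc) {
        throw new Error(`no input named '${name}'`);
      }
      if (tensor.destroyed) {
        throw new Error(`tensor '${name}' was destroyed before build()`);
      }
      if (desc.dataType !== tensor.dataType ||
          desc.shape.join() !== tensor.shape.join()) {
        throw new Error(`descriptor mismatch for '${name}'`);
      }
      bound.set(name, tensor.data.slice()); // the graph owns its own copy
    }
    return { constants: bound };
  }
}

// One tensor, two graphs (e.g., an encoder and a decoder sharing weights).
const weights = new MockTensor(
    { dataType: 'uint8', shape: [4] }, new Uint8Array([1, 2, 3, 4]));

const builder = new MockGraphBuilder();
builder.input('myconstant', { dataType: weights.dataType, shape: weights.shape });

const graphA = builder.build({ myconstant: weights });
const graphB = builder.build({ myconstant: weights });

// Once every build() has succeeded, the source tensor can be released.
weights.destroy();

const graphCopy = Array.from(graphA.constants.get('myconstant'));
console.log(graphCopy, graphB.constants.has('myconstant'));
```

In this model, calling build() with a tensor that has already been destroyed throws, which is why destroy() comes last.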

Proposed IDL

partial interface MLContext {
    Promise<MLTensor> createTensor(MLTensorDescriptor descriptor, optional ArrayBufferView sourceData);
};

partial interface MLGraphBuilder {
    Promise<MLGraph> build(MLNamedOperands outputs, optional MLNamedTensors constants = {});
};

Edits:

9/16: Added MLOperandDescriptor as required by MLOperand
9/18: Added constant-initializer to createTensor()
9/19: Reuse input(..) via constant usage flag

bbernhar commented 1 month ago

@a-sully @RafaelCintron @fdwr @huningxin appreciate any feedback

fdwr commented 1 month ago

constant_input -> constantInput (🚫🐍).

I'd need more time to think for meaningful feedback, but it may be rather confusing having this list of methods o_o:

graphBuilder.input()
graphBuilder.constant()
graphBuilder.constantInput()

bbernhar commented 1 month ago

> I'd need more time to think for meaningful feedback, but it may be rather confusing having this list of methods o_o:

Thanks for the quick feedback. I simplified the proposal even further via initializer + new usage bit.

mmccool commented 1 month ago

Definitely interested in this from the point of view of caching models as well (especially weights that might be used by both WebGPU and WebNN implementations).
