webmachinelearning / webnn

🧠 Web Neural Network API
https://www.w3.org/TR/webnn/

Support building graphs from `MLTensor` containing constants #760

Open bbernhar opened 1 month ago

bbernhar commented 1 month ago

Demonstrate how MLTensor can be used to help web developers manage constant data (e.g., trained weights) on-device.

Dependent PRs

Motivation

Design

An MLTensor containing constant data is associated by name when the MLOperand is created. At build(), the (unoptimized) constant data is copied to the device. The original constant data (i.e., the ArrayBuffer input or the uploaded device data held by the MLTensor) can be discarded as soon as writeBuffer() has been called and build() succeeds.

Example JS

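// Upload constant data directly to the device (createTensor() returns a promise per the proposed IDL)
const constant_tensor = await ctx.createTensor({usage: MLTensorUsage.GRAPH_CONSTANT, ...}, new Uint8Array(...), ...); // immutable

const builder = new MLGraphBuilder(ctx);
// Declare an operand whose name is later used to bind the constant tensor
const constant = builder.input('myconstant', {dataType: constant_tensor.dataType, shape: constant_tensor.shape});
...
// Bind the tensor to the operand by name; build() copies the constant data to the device
const graph = await builder.build(outputs, {'myconstant': constant_tensor});

// Optional: free up system memory
constant_tensor.destroy();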

Proposed IDL

partial interface MLContext {
    Promise<MLTensor> createTensor(MLTensorDescriptor descriptor, optional ArrayBufferView sourceData);
};

partial interface MLGraphBuilder {
    Promise<MLGraph> build(MLNamedOperands outputs, optional MLNamedTensors constants = {});
};
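
To make the lifetime rule from the Design section concrete, here is a minimal in-memory sketch of the two proposed methods. It is not the WebNN implementation: `MockTensor`, `MockContext`, `MockGraphBuilder`, and `demo` are hypothetical names invented for illustration, usage flags are omitted, and "copying to the device" is modeled as a plain typed-array copy. It only shows that a constant bound by name at build() can outlive the tensor it came from.

```javascript
// Hypothetical mock of the proposed API shape; NOT the real WebNN classes.
class MockTensor {
  constructor(descriptor, sourceData) {
    this.dataType = descriptor.dataType;
    this.shape = descriptor.shape;
    // Copy the source so the caller's buffer can be released independently.
    this.data = sourceData ? sourceData.slice() : null;
    this.destroyed = false;
  }
  destroy() {
    // Release the tensor's own copy of the constant data.
    this.data = null;
    this.destroyed = true;
  }
}

class MockContext {
  // Mirrors the proposed signature: createTensor(descriptor, optional sourceData).
  async createTensor(descriptor, sourceData) {
    return new MockTensor(descriptor, sourceData);
  }
}

class MockGraphBuilder {
  constructor(context) {
    this.context = context;
    this.inputs = new Map(); // operand name -> descriptor
  }
  input(name, descriptor) {
    this.inputs.set(name, descriptor);
    return { name, ...descriptor };
  }
  // Mirrors the proposed signature: build(outputs, optional constants = {}).
  async build(outputs, constants = {}) {
    const bound = new Map();
    for (const [name, tensor] of Object.entries(constants)) {
      const descriptor = this.inputs.get(name);
      if (!descriptor) throw new TypeError(`no input named "${name}"`);
      if (tensor.destroyed || tensor.data === null) {
        throw new TypeError(`tensor "${name}" holds no constant data`);
      }
      if (tensor.dataType !== descriptor.dataType ||
          tensor.shape.join() !== descriptor.shape.join()) {
        throw new TypeError(`tensor "${name}" does not match its operand`);
      }
      // Stand-in for the device copy made at build().
      bound.set(name, tensor.data.slice());
    }
    return { outputs, constants: bound };
  }
}

async function demo() {
  const ctx = new MockContext();
  const weights = await ctx.createTensor(
    { dataType: 'uint8', shape: [4] }, new Uint8Array([1, 2, 3, 4]));
  const builder = new MockGraphBuilder(ctx);
  const constant = builder.input(
    'myconstant', { dataType: weights.dataType, shape: weights.shape });
  const graph = await builder.build({ out: constant }, { myconstant: weights });
  weights.destroy(); // safe: the graph keeps its own copy
  return { graph, weights };
}
```

Calling `demo()` resolves with a graph whose `constants` map still holds the four bytes after `weights.destroy()` has run. In this sketch, passing an already-destroyed tensor to build() is rejected with a TypeError; how the real API should behave in that case is not stated in the proposal.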

Edits:

bbernhar commented 1 month ago

@a-sully @RafaelCintron @fdwr @huningxin appreciate any feedback

fdwr commented 1 month ago

constant_input -> constantInput (🚫🐍).

I'd need more time to think for meaningful feedback, but it may be rather confusing having this list of methods o_o:

graphBuilder.input()
graphBuilder.constant()
graphBuilder.constantInput()

bbernhar commented 1 month ago

> I'd need more time to think for meaningful feedback, but it may be rather confusing having this list of methods o_o:

Thanks for the quick feedback. I simplified the proposal even further via initializer + new usage bit.

mmccool commented 1 month ago

Definitely interested in this from the point of view of caching models as well (especially weights that might be used by both WebGPU and WebNN implementations).