w3ctag / design-reviews

W3C specs and API reviews
Creative Commons Zero v1.0 Universal

Web Machine Learning Model Loader API #759

Closed. wacky6 closed this 1 year ago.

wacky6 commented 2 years ago

Wotcher TAG!

I'm requesting a TAG review of the Web Machine Learning (WebML) Model Loader API.

The WebML Model Loader API is a proposed web API to take a custom, pre-trained machine learning (ML) model in a standard format, and apply it to some data in JavaScript to perform inference, like classification, regression, or ranking. The idea is to make it easy and performant to use a custom, pre-built machine learning model in web apps, across devices and browsers.

Further details:

We'd prefer the TAG provide feedback as: 🐛 open issues in our GitHub repo for each point of feedback

cynthia commented 2 years ago

Quick skim, this looks fine. Assuming there is no path to agree on a standard format, two questions:

1) How does the caller know what kind of format they can .load()? What happens when .load() fails due to an unsupported format?
2) Why is tensor creation a factory and not a constructor? Would a tensor be usable in other (non-ML) contexts as well?

wacky6 commented 2 years ago

  1. How does the caller know what kind of format they can .load()? What happens when .load() fails due to an unsupported format?

The caller can pass modelFormat when creating the context: ml.createContext({ modelFormat: "tflite" }).

If the backend doesn't support the format, the promises returned by createContext() and load() are rejected.
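
A minimal sketch of how a caller might probe for a supported format, assuming only the rejection behaviour described above. The `makeFakeMl` stand-in, the `pickContext` helper, and the error message are hypothetical and exist purely so the example can run outside a browser; they are not part of the proposed API.

```javascript
// Hypothetical stand-in for the `ml` entry point discussed in this thread.
// It mimics only the behaviour described above: createContext() rejects
// when the backend does not support the requested modelFormat.
function makeFakeMl(supportedFormats) {
  return {
    createContext({ modelFormat }) {
      if (!supportedFormats.includes(modelFormat)) {
        return Promise.reject(new Error(`Unsupported model format: ${modelFormat}`));
      }
      return Promise.resolve({ modelFormat });
    },
  };
}

// Hypothetical helper: try each candidate format in order and return the
// first context whose promise resolves, or null if every attempt rejects.
async function pickContext(ml, candidateFormats) {
  for (const modelFormat of candidateFormats) {
    try {
      return await ml.createContext({ modelFormat });
    } catch (err) {
      // Rejected: this backend cannot handle the format, so try the next one.
    }
  }
  return null;
}
```

With a backend that only supports "tflite", `pickContext(ml, ["onnx", "tflite"])` would skip the rejected "onnx" attempt and resolve with the "tflite" context.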

  2. Why is tensor creation a factory and not a constructor?

Model Loader shares its tensor definition with the Neural Network API. Currently, a tensor is duck-typed (fixed-type array + shape).

Discussions on whether tensor should be strongly typed: https://github.com/webmachinelearning/webnn/issues/275 .
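
To illustrate what "duck-typed" means here, the following is a rough sketch of a structural check for a tensor-like value. The property names `data` and `shape` and the helper `isTensorLike` are assumptions made for illustration; the actual field names in the specs may differ.

```javascript
// Assumed layout of a duck-typed tensor: { data: <TypedArray>, shape: number[] }.
// The property names are illustrative, not taken from the spec.
function isTensorLike(value) {
  if (value === null || typeof value !== "object") return false;
  const { data, shape } = value;
  // ArrayBuffer.isView is true for typed arrays and DataView; exclude DataView.
  if (!ArrayBuffer.isView(data) || data instanceof DataView) return false;
  if (!Array.isArray(shape)) return false;
  if (!shape.every((d) => Number.isInteger(d) && d >= 0)) return false;
  // The element count implied by the shape must match the buffer length.
  const expected = shape.reduce((n, d) => n * d, 1);
  return data.length === expected;
}
```

Any object that passes such a check can be treated as a tensor, with no dedicated class involved, which is the trade-off the linked discussion weighs against a strongly typed tensor.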

  3. Would a tensor be usable in other (non-ML) contexts as well?

As a general concept, a tensor is a fixed-size, fixed-type array with an arbitrary number of dimensions. Tensor computations are usually expressed using specialized methods (for performance reasons), instead of combining for-loops and basic arithmetic operations.

For Model Loader and the Neural Network API (and perhaps future ML-related Web APIs), we aim to share the basic classes (like MLContext and the tensor definition).

In non-ML contexts, I'm not sure the tensor concept is widely used.

For non-ML Web APIs, I'm not sure if Tensor is more advantageous than TypedArray. To get the performance benefit, JavaScript probably needs to express computations using Tensor-specific compute methods (instead of doing element-wise computation using loops). This requires further standardization for each computation type.

For example, matrix multiplication, dot product, element-wise add, and element-wise multiply would need to be four separate methods.
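
As a rough illustration of why each computation type ends up as its own method, here is a plain-JavaScript sketch of those four operations over an assumed `{ data, shape }` tensor layout (row-major). The function names and layout are hypothetical. The bodies are ordinary loops, so they show the method surface only and none of the performance benefit a native implementation could provide.

```javascript
// Tensors here are plain objects { data: Float32Array, shape: number[] },
// an assumed row-major layout used only for this sketch.

// Shared helper for element-wise operations on tensors of identical shape.
function elementwise(a, b, op) {
  if (a.shape.length !== b.shape.length || a.shape.some((d, i) => d !== b.shape[i])) {
    throw new Error("shape mismatch");
  }
  const out = new Float32Array(a.data.length);
  for (let i = 0; i < out.length; i++) out[i] = op(a.data[i], b.data[i]);
  return { data: out, shape: [...a.shape] };
}

const add = (a, b) => elementwise(a, b, (x, y) => x + y);
const multiply = (a, b) => elementwise(a, b, (x, y) => x * y);

// Dot product of two 1-D tensors of equal length; returns a number.
function dot(a, b) {
  if (a.shape.length !== 1 || b.shape.length !== 1 || a.shape[0] !== b.shape[0]) {
    throw new Error("dot expects two 1-D tensors of equal length");
  }
  let sum = 0;
  for (let i = 0; i < a.data.length; i++) sum += a.data[i] * b.data[i];
  return sum;
}

// Matrix product of an (m x k) and a (k x n) tensor; returns an (m x n) tensor.
function matmul(a, b) {
  if (a.shape.length !== 2 || b.shape.length !== 2 || a.shape[1] !== b.shape[0]) {
    throw new Error("matmul expects (m x k) and (k x n) tensors");
  }
  const [m, k] = a.shape;
  const n = b.shape[1];
  const out = new Float32Array(m * n);
  for (let i = 0; i < m; i++) {
    for (let j = 0; j < n; j++) {
      let sum = 0;
      for (let p = 0; p < k; p++) sum += a.data[i * k + p] * b.data[p * n + j];
      out[i * n + j] = sum;
    }
  }
  return { data: out, shape: [m, n] };
}
```

Note that `multiply` and `matmul` take the same inputs but compute different things, which is why a single generic "multiply" method would not be enough.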

torgo commented 2 years ago

Hi @wacky6, we're just reviewing today and one thing that came up was your mention of the idea of a "DRM-like solution". We appreciate that you see protecting models might be an issue. However, if you're going to seriously work on this, we'd appreciate being involved, because there are a lot of issues and complications with DRM, the differences between its use case and this one, and the web's security model. We suggest leaving this for future work.

jbingham commented 2 years ago

Thanks, @torgo ! Agree that DRM is a very futuristic idea for ML models. It will be a crazy huge amount of work, and we definitely will need your help (and many others as well). Let's focus on the more immediate steps to get the API ready for an origin trial.

wacky6 commented 2 years ago

Thanks @torgo.

DRM in ML models is a novel topic. We'll definitely work with the TAG and wider industry groups on that in the future.

We'll file dedicated TAG review requests (or propose it as a separate spec), considering the complexity and impact.

For now, let's focus on non-DRM, "open-source" format (e.g. TensorFlow / Torch / ONNX) use cases for this API.

cynthia commented 2 years ago

For the record, I am anything but pro-DRM.

The word "DRM" likely spooked people here, but unlike conventional DRM where you are 1) running mystery meat code for decryption (see: CDM) or 2) mystery meat code (which could be potentially malicious) that is protected (see: DRM protected executables which prevent inspection) - this (as of today) would mostly be a form of protected transport from an origin to the compute backend. As it would mostly be coefficients for known operations (e.g. dot products) it won't have the same level of implications as conventional DRM.

(There is the question of obstruction of interpretability in the context of ethics, but W3C likely does not have the right expertise for that discussion.)

yuhonglin commented 2 years ago

Hi all, just want to point out that "Intellectual Property protection" is a super important topic in practice because lots of ML models cost a lot to train. If we want commercial companies to use their best ML models on the web, we have to figure out a solution for this.

And the Model Loader API is possibly the most likely candidate to have "IP protection". Comparatively, it is almost impossible for WASM/WebGPU/WebGL-based ML solutions to have this. So if we can achieve it, it will be a killer feature for the Model Loader API.

I am not an expert on these topics (yet). But I suggest investigating this whenever possible.

cynthia commented 1 year ago

We discussed this in our breakout yesterday - and our conclusion is that overall we are happy with this proposal. We still have some concerns around the factory vs constructor pattern - as it is not a pattern we encourage on the platform. See: https://www.w3.org/TR/design-principles/#constructors for more context.

That said, we don't feel particularly strongly about enforcing a change in the paradigm. Thank you for bringing this to our attention.
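
For readers unfamiliar with the factory vs constructor distinction raised in this thread, the two creation styles can be contrasted roughly as follows. The `Tensor` class and `tensorFactory` object are hypothetical and are not taken from the Model Loader or Neural Network API specs.

```javascript
// Hypothetical Tensor class, used only to contrast the two creation styles.
class Tensor {
  constructor(data, shape) {
    const expected = shape.reduce((n, d) => n * d, 1);
    if (data.length !== expected) {
      throw new RangeError(`shape implies ${expected} elements, got ${data.length}`);
    }
    this.data = data;
    this.shape = shape;
  }
}

// Factory style: creation goes through a method on some owner object.
const tensorFactory = {
  createTensor(data, shape) {
    return new Tensor(data, shape);
  },
};

// Constructor style: callers use `new` directly.
const viaConstructor = new Tensor(new Float32Array(4), [2, 2]);

// Factory style: callers need the owner object first.
const viaFactory = tensorFactory.createTensor(new Float32Array(4), [2, 2]);
```

Both calls produce the same kind of object. The difference is whether authors can create one with `new` directly or must first obtain an owner object that exposes the factory method.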