w3ctag / design-reviews

W3C specs and API reviews
Creative Commons Zero v1.0 Universal

Web Neural Network API #570

Closed anssiko closed 2 years ago

anssiko commented 3 years ago

Hi TAG!

I'm requesting a TAG review of the Web Neural Network API.

The Web Neural Network API (WebNN API for short) is a specification for constructing and executing computational graphs of neural networks. It gives web applications the ability to create, compile, and run machine learning networks in web browsers. The WebNN API may be implemented in web browsers on top of the available native operating system machine learning APIs for the best performance and reliability of results.

Further details:

You should also know that...

[please tell us anything you think is relevant to this review]

We'd prefer the TAG provide feedback as:

🐛 open issues in our GitHub repo for each point of feedback

anssiko commented 3 years ago

The Machine Learning for the Web Community Group congratulates @cynthia for his re-election to the TAG and looks forward to the TAG review comments :-)

anssiko commented 3 years ago

Discussed this briefly with @kenchris, who kindly volunteered to share his high-level review comments on the explainer:

I probably missed some of @kenchris's insights, so please fill me in.

kenchris commented 3 years ago

I am looking at this with @cynthia now, but here are some of my comments from yesterday:

Yes, I definitely think the explainer should better explain the use cases and quickly introduce the major new terminology, such as neural network, AI, model loader, etc.

It should then clearly explain the pros and cons of each approach (bullet points would be nice), making it clear that even if pursuing a model loader right now seems complicated due to the lack of a standardized format, that does not mean a neural network API will be useless once such a format exists.

Also, once you have the use cases, it would be nice to see which of the available options (model loader, neural network API, etc.) solve each use case and which don't; for example, training won't be solved by a model loader.

Also, since some of this could be implemented or polyfilled with WASM, WebGL, or WebGPU, that discussion seems important. The explainer gives arguments for why this might not be a good solution, but existing libraries work on top of these, so do they also suffer from all the issues you are listing? A look at the performance or battery efficiency of this new approach would be appropriate.

cynthia commented 3 years ago

@kenchris and I looked at this today.

First-pass review - we have a bunch of questions:

  1. The fact that a GRU is in there really sticks out. I somehow found out why it is there, but it feels extremely inconsistent with the rest of the API, which is fairly generic. (e.g. you should have an LSTM and a GRU, but not just a GRU; that's weird.)
  2. In the spec, some of the activations are in global scope (e.g. relu), while some are unary operators (sigmoid, tanh); this doesn't look consistent.
  3. The spec mentions training in the batch normalization section - but I'm fairly convinced that there is no support for training. Is this an error?
  4. getNeuralNetworkContext() and createModelBuilder() seem strange (no parameters, for one thing) - is this expected to accept parameters/configs at some point? If so, we'd like to see what is intended here.
  5. Wouldn't it make sense to have a constructor rather than a builder pattern for createModelBuilder()? (e.g. new ModelBuilder(navigator.ml.getNNContext()))
  6. I see quite a few view/reshape-like functions; which of these are expected to copy and which are not? Probably good to note this in the spec.
  7. If there are layers that will take activations as string enums, there should simply be a string enum for activations rather than having it only in RecurrentNetworkActivation. (One may argue that hyperbolic tangent is RNN-specific, but...)
  8. While the limitations of JavaScript probably contribute a lot to this, the ergonomics of this API, judging by the example code, have room for improvement.
  9. It feels like errors/exceptions should probably be fleshed out. (e.g. what happens when you try to reduce on a non-existent axis?)
  10. I don't quite understand the NamedOutput mechanism. What if what is output is just a feature?
  11. A lot of the names are very generic (Operand, Compilation) - this feels like something we might want to prefix with something or synchronize with TC39 about.
  12. What's the isomorphic JS story for this? Also, given that this is attached to vanilla navigator, is this not expected to work in a worker scope?
  13. Given that bootstrapping a network is a lot of work, would it make sense to have some sort of serialization/caching story here?
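To make point 5 concrete, here is a sketch of the two construction styles side by side. MLContext, ModelBuilder, and createModelBuilder here are hypothetical stand-ins written in plain JavaScript, not the actual WebNN interfaces:

```javascript
// Hypothetical stand-ins for the WebNN types under discussion.
class MLContext {}

class ModelBuilder {
  constructor(context) {
    this.context = context;
  }
}

// Factory style, roughly as currently specified:
function createModelBuilder(context) {
  return new ModelBuilder(context);
}

const context = new MLContext();

// The two styles produce equivalent builders bound to the same context.
const viaFactory = createModelBuilder(context);
const viaConstructor = new ModelBuilder(context);
```

The constructor form has the advantage of being directly subclassable and of making the context dependency explicit at the call site.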

Nits:

  1. The one case where I saw clamp() being used seemed to implement a relu?
  2. Search for "creatModelBuilder" in the explainer.
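For context on the first nit: relu is exactly the special case of clamp with a lower bound of 0 and no upper bound, which is presumably why the example used clamp(). In plain JavaScript (a sketch, independent of the WebNN builder):

```javascript
// Generic clamp: restrict x to the closed interval [min, max].
function clamp(x, min, max) {
  return Math.min(Math.max(x, min), max);
}

// relu is clamp with min = 0 and no upper bound.
function relu(x) {
  return clamp(x, 0, Infinity);
}
```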

cynthia commented 3 years ago

One more point - feels like having a Sequential() would be nicer syntax wise.
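A sketch of what such a Sequential() helper could look like. The name and shape here are hypothetical, using plain functions as stand-ins for builder ops:

```javascript
// Hypothetical Sequential() helper: composes a list of layer functions
// into one function that threads its input through each layer in order.
function Sequential(...layers) {
  return (input) => layers.reduce((x, layer) => layer(x), input);
}

// Plain numeric "layers" standing in for builder operations:
const relu = (x) => Math.max(0, x);
const double = (x) => x * 2;
const addOne = (x) => x + 1;

const model = Sequential(relu, double, addOne);
// model(-5): relu gives 0, double gives 0, addOne gives 1
```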

anssiko commented 3 years ago

Thank you @cynthia and @kenchris for sharing the TAG review feedback with us.

The group will discuss this feedback on its 4 February 2021, 15:00-16:00 UTC+0 teleconference. We will dedicate most of our 1-hour meeting to this topic. You're welcome to attend, subject to your availability. I apologize in advance that the time is suboptimal for APAC participants.

We may create separate GH issues to track this feedback in the https://github.com/webmachinelearning/webnn/ repo and @ you to review related PRs.

Thank you again for sharing your insights, we look forward to improving and clarifying the WebNN API with your help.

anssiko commented 3 years ago

Discussed this TAG feedback on the WebML CG call today (minutes). Below are the responses and issues labeled with "tag" opened in response to the feedback. Thank you!

  1. The fact that a GRU is in there really sticks out. I somehow found out why it is there, but it feels extremely inconsistent with the rest of the API, which is fairly generic. (e.g. you should have an LSTM and a GRU, but not just a GRU; that's weird.)

This is because we're not feature complete yet; this is a "v1" API.

  2. In the spec, some of the activations are in global scope (e.g. relu), while some are unary operators (sigmoid, tanh); this doesn't look consistent.

https://github.com/webmachinelearning/webnn/issues/133

  3. The spec mentions training in the batch normalization section - but I'm fairly convinced that there is no support for training. Is this an error?

https://github.com/webmachinelearning/webnn/issues/134

  4. getNeuralNetworkContext() and createModelBuilder() seem strange (no parameters, for one thing) - is this expected to accept parameters/configs at some point? If so, we'd like to see what is intended here.

https://github.com/webmachinelearning/webnn/issues/135

  5. Wouldn't it make sense to have a constructor rather than a builder pattern for createModelBuilder()? (e.g. new ModelBuilder(navigator.ml.getNNContext()))

https://github.com/webmachinelearning/webnn/issues/136

  6. I see quite a few view/reshape-like functions; which of these are expected to copy and which are not? Probably good to note this in the spec.

https://github.com/webmachinelearning/webnn/issues/137

  7. If there are layers that will take activations as string enums, there should simply be a string enum for activations rather than having it only in RecurrentNetworkActivation. (One may argue that hyperbolic tangent is RNN-specific, but...)

https://github.com/webmachinelearning/webnn/issues/138

  8. While the limitations of JavaScript probably contribute a lot to this, the ergonomics of this API, judging by the example code, have room for improvement.

https://github.com/webmachinelearning/webnn/issues/139

  9. It feels like errors/exceptions should probably be fleshed out. (e.g. what happens when you try to reduce on a non-existent axis?)

https://github.com/webmachinelearning/webnn/issues/19

  10. I don't quite understand the NamedOutput mechanism. What if what is output is just a feature?

https://github.com/webmachinelearning/webnn/issues/140

  11. A lot of the names are very generic (Operand, Compilation) - this feels like something we might want to prefix with something or synchronize with TC39 about.

https://github.com/webmachinelearning/webnn/issues/141

  12. What's the isomorphic JS story for this? Also, given that this is attached to vanilla navigator, is this not expected to work in a worker scope?

https://github.com/webmachinelearning/webnn/issues/142

  13. Given that bootstrapping a network is a lot of work, would it make sense to have some sort of serialization/caching story here?

A non-goal per the explainer.

Nits:

  1. The one case I saw clamp() being used seemed to implement a relu?
  2. Search for "creatModelBuilder" in the explainer.

One more point - feels like having a Sequential() would be nicer syntax-wise.

https://github.com/webmachinelearning/webnn/issues/143

We ran out of time here. The rest of the feedback to be discussed on another call.

wchao1115 commented 3 years ago

@cynthia

The spec mentions training in the batch normalization section - but I'm fairly convinced that there is no support for training. Is this an error?

The description of batch-norm in the spec is accurate but poorly worded. I'll fix that. To clarify here: it attempts to explain the origin of the input params mean and variance, namely that they are the products of a training phase of batch-normalization (which is out of scope for webnn for now). This distinction is needed to draw a contrast between batch-normalization and instance-normalization, as the latter operation computes the mean and variance values per instance on the fly within an inference call. I'll make the text clearer to avoid confusion.
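The distinction can be sketched in plain JavaScript over 1-D arrays (an illustration of the concept, not the actual WebNN operations):

```javascript
// batchNormalization: mean and variance arrive as inputs,
// precomputed during a (here out-of-scope) training phase.
function batchNormalization(x, mean, variance, epsilon = 1e-5) {
  return x.map((v) => (v - mean) / Math.sqrt(variance + epsilon));
}

// instanceNormalization: mean and variance are computed from x itself,
// on the fly, within the inference call.
function instanceNormalization(x, epsilon = 1e-5) {
  const mean = x.reduce((a, b) => a + b, 0) / x.length;
  const variance = x.reduce((a, v) => a + (v - mean) ** 2, 0) / x.length;
  return batchNormalization(x, mean, variance, epsilon);
}
```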

cynthia commented 3 years ago

@wchao1115 I see your intent now. I figured that mentioning training in general would be confusing for readers. That description makes more sense, and I would like to see the new text when it's there. Thanks!

anssiko commented 3 years ago

The Web Machine Learning WG (we transitioned from a CG into a WG during the TAG review!) has now addressed all TAG review feedback. We tracked your feedback in the Web Neural Network API GH repo issues with a "tag-tracker" label: https://github.com/webmachinelearning/webnn/issues?q=label%3Atag-tracker+is%3Aclosed

On behalf of the group, I want to thank @cynthia and the TAG for the careful review. With your feedback, the specification was substantially improved. Please do not hesitate to reach out to us with any further feedback or questions.

kenchris commented 3 years ago

Just a side-note here:

When I see code snippets like

return builder.add(
  builder.max(0, x),
  builder.mul(
    builder.constant(options.alpha),
    builder.sub(
      builder.exp(builder.min(builder.constant(0), x)),
      builder.constant(1))));

I am wondering if that could be made more readable when/if the pipeline operator lands in JavaScript: https://github.com/tc39/proposal-pipeline-operator

It might make sense to look through examples like this to see whether they fit well with the pipeline operator or whether any changes should be made.
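For reference, the nested snippet above computes the ELU activation, max(0, x) + alpha * (exp(min(0, x)) - 1). A scalar plain-JS equivalent (a sketch, independent of the builder API) makes the math easier to follow than the nested builder calls:

```javascript
// Scalar equivalent of the nested builder expression:
// ELU(x) = max(0, x) + alpha * (exp(min(0, x)) - 1)
function elu(x, alpha = 1.0) {
  return Math.max(0, x) + alpha * (Math.exp(Math.min(0, x)) - 1);
}
```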

torgo commented 3 years ago

Hi @anssiko - thanks for this and for tracking it so excellently. It certainly seems the group has taken a lot of the TAG feedback on board. Before closing, I still have a concern about multi-implementer support. Currently there doesn't seem to be a Chrome Status entry for this API. What signals, if any, do you have from other implementers (e.g. is there a Mozilla standards position)? As the group is a WG now (which is great), you'll definitely need to have multiple implementations. What's the plan for that, and what's the plan for trialing this with developers?

anssiko commented 3 years ago

The WG is aware of multiple work-in-progress implementations that use independent backend implementations, building on top of existing major platform APIs, across major OSes.

Some group participants hinted we may hear more at WebML WG's TPAC meeting, including information on developer-facing trial plans.

See also https://github.com/webmachinelearning/webnn/pull/213

Thank you!

cynthia commented 2 years ago

Sorry for the delay. We discussed this at length over multiple calls, and while there have been some disagreements on the design principles of the API, we don't think they're critical enough to warrant an unsatisfied resolution. We're happy to see this work proceed. Thank you for bringing this to our attention.