webmachinelearning / webnn

🧠 Web Neural Network API
https://www.w3.org/TR/webnn/

Ergonomics of the JS examples #139

Closed: anssiko closed this issue 3 years ago

anssiko commented 3 years ago
  1. While the limitations of JavaScript probably contribute a lot to this, the ergonomics of this API, judging by the example code, might have room for improvement.

via https://github.com/w3ctag/design-reviews/issues/570#issuecomment-768875996

Probably related https://github.com/webmachinelearning/webnn/issues/106

cynthia commented 3 years ago

It's not just the examples, but the actual API itself. Unless the compute graph's topology is complicated (see: Inception V3...), reading the code for a sequential model should give you a rough idea of what the model is doing, but right now this is difficult. Even for something as simple as LeNet, a lot of code is needed. (The other part is that the ops coming from the builder feel strange, but that's a totally separate discussion.)

https://github.com/huningxin/webnn-samples/blob/master/lenet/lenet.js
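
For reference, the builder-style code in question looks roughly like the following. This is a heavily abridged sketch, not the actual lenet.js: the weight buffer is a placeholder, and the descriptor and option names follow the spec draft of the time and may have changed since.

// Placeholder weights; the real sample slices these out of a pre-trained
// weight file it fetches at runtime.
const conv1Weights = new Float32Array(6 * 1 * 5 * 5);

const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

// NCHW input: one 28x28 grayscale image.
const input = builder.input('input', {type: 'float32', dimensions: [1, 1, 28, 28]});

// First stage: conv (1 -> 6 channels, 5x5 kernel), tanh, 2x2 average pool.
const conv1Filter = builder.constant({type: 'float32', dimensions: [6, 1, 5, 5]}, conv1Weights);
const conv1 = builder.conv2d(input, conv1Filter);
const tanh1 = builder.tanh(conv1);
const pool1 = builder.averagePool2d(tanh1, {windowDimensions: [2, 2], strides: [2, 2]});

// The remaining conv/pool stages, the reshape into the fully connected
// layers, and the final softmax all repeat this pattern, each with its own
// explicit constant() calls for weights and biases, before the graph is
// built with builder.build().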

In contrast, the same thing expressed in a common framework (PyTorch here, though tf.keras would look roughly the same) reads as follows, sans the weight loading, and gives you a pretty clear overview of what this network does. (I might have got some of the minor details wrong, but you get the idea.)

import torch.nn as nn

# LeNet-style network: three conv/tanh/avg-pool stages followed by two
# fully connected layers and a softmax over the 10 classes.
nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5, stride=1),
    nn.Tanh(),
    nn.AvgPool2d(kernel_size=2),
    nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5, stride=1),
    nn.Tanh(),
    nn.AvgPool2d(kernel_size=2),
    nn.Conv2d(in_channels=16, out_channels=120, kernel_size=5, stride=1),
    nn.Tanh(),
    nn.Flatten(1),
    nn.Linear(in_features=120, out_features=84),
    nn.Tanh(),
    nn.Linear(in_features=84, out_features=10),
    nn.Softmax(dim=1),
)

I guess the question I'd like to ask here is: "is this API for web developers or framework developers?"

huningxin commented 3 years ago

@cynthia, thanks for your feedback. I agree there is room to improve the lenet.js example code.

What do you think about the nsnet2.js example code? We also use the NSNet2 network-building code as an example in the explainer.

I guess the question I'd like to ask here is: "is this API for web developers or framework developers?"

As the architecture diagram in the explainer shows, this API would be mainly consumed by JS ML frameworks, so it is primarily for framework developers. @wchao1115 @pyu10055 @anssiko, feel free to chime in.

wchao1115 commented 3 years ago

We believe most web developers will find working at the framework level easier, i.e. deploying pre-trained models in framework-specific formats. But there is nothing preventing them from programming the WebNN API directly, similar to how people use WebGL/WebGPU today. For that reason we give priority to inclusivity with all existing frameworks, along with flexibility and control, as WebNN should be an ideal web platform backend for any deep learning framework.

gramalingam commented 3 years ago

I agree that the API is more likely to be used by framework developers. Further, I think the LeNet example above serves the purpose of illustrating the details of API usage, and for that purpose it makes sense to make the details explicitly visible, to clarify the API semantics. I think it is feasible (even for developers) to add syntactic sugar as a thin layer on top of the API to allow a more declarative and compact way of describing the LeNet model.

The reasons for the current size are several-fold, mostly generality. E.g., the "Sequential" style will not work for cases where the output of one layer is used by multiple different subsequent layers, which must be accommodated in the general case. Further, in the inference setting we need to use the pre-trained weights, unlike in the training setting, where declaring the shapes of the weights to be trained is sufficient. These weights are implicit parameters in the PyTorch example. Making them explicit parameters is better in a general-purpose lower-level API (allowing frameworks to even mix WebNN with WebGL/WebGPU etc. in their compilation), and once they are explicit parameters, the computation does not look "linear" but more like a computation tree.
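
To make the "thin layer of syntactic sugar" idea concrete, here is a minimal, hypothetical sketch (not part of the spec or the samples) of a sequential() helper over MLGraphBuilder. The layer factories and weight constants (conv1Filter etc.) are assumptions for illustration; the point is that the weights stay explicit while the topology reads linearly.

// Hypothetical helper: threads one operand through a list of layer
// functions, each of which closes over its own explicit weight constants.
function sequential(builder, layers) {
  return (input) =>
      layers.reduce((operand, layer) => layer(builder, operand), input);
}

// Usage sketch for the LeNet topology discussed above.
const lenet = sequential(builder, [
  (b, x) => b.conv2d(x, conv1Filter),
  (b, x) => b.tanh(x),
  (b, x) => b.averagePool2d(x, {windowDimensions: [2, 2], strides: [2, 2]}),
  // ...remaining conv, pool, reshape, and fully connected layers...
  (b, x) => b.softmax(x),
]);
const output = lenet(input);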

anssiko commented 3 years ago

RESOLUTION: Explain in the spec intro the rationale for why the primary API consumer is a JS framework; note the Model Loader API as a higher-level abstraction targeting web developers.

via https://www.w3.org/2021/04/15-webmachinelearning-minutes.html#r03

anssiko commented 3 years ago

@cynthia we'll touch on this issue on our 29 April call. If you have comments regarding the resolution, or perhaps further concrete suggestions on ergonomics improvements, we'd be happy to receive them so we can incorporate them into the PR.

anssiko commented 3 years ago

@cynthia gentle ping. We'd like to resolve this issue by our next call in two weeks, so please share any concrete suggestions on ergonomics improvements you may have so they can be incorporated into the PR.

anssiko commented 3 years ago

As discussed in the WebML WG Teleconference – 2 Sep 2021, the group feels this issue has been addressed by #202.