WebAssembly / wasi-nn

Neural Network proposal for WASI

Question about the API: why is execution-target in load()? #31

Closed Finfalter closed 2 years ago

Finfalter commented 2 years ago

I wonder why execution-target is part of the function signature of load(); for my use case I would instead expect it in init-execution-context():

// Load an opaque sequence of bytes to use for inference.
load: func(builder: graph-builder-array, encoding: graph-encoding, target: execution-target) -> expected<graph, error>
// Create an execution instance of a loaded graph.
init-execution-context: func(graph: graph) -> expected<graph-execution-context, error>

Rationale: In the WIT you say that "A graph is a load-ed instance of a specific ML model (e.g., MobileNet) for a specific ML framework". I agree with that, and I agree that, for example, graph-encoding is an important parameter for load() because it ensures the compatibility of the graph with the ML framework.

However, to my mind the execution-target parameter rather determines the execution context. Even if a graph may be specific to an execution-target, one only has to make sure that the execution-target is configured correctly, and the execution context is what init-execution-context() sets up, right?

Why is execution-target not part of init-execution-context() signature?
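For illustration, the variant this question implies might look roughly like the following. This is a hypothetical sketch, not actual wasi-nn WIT; the target parameter simply moves from load() to init-execution-context():

```wit
// Hypothetical: load no longer needs to know the target...
load: func(builder: graph-builder-array, encoding: graph-encoding) -> expected<graph, error>
// ...because the target is supplied when the execution context is created.
init-execution-context: func(graph: graph, target: execution-target) -> expected<graph-execution-context, error>
```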

abrown commented 2 years ago

I wonder why execution-target is part of the function signature of load()

Some frameworks need the target very early on because they optimize (i.e., compile) the model for that device. Take OpenVINO, for instance: it needs exactly that information to compile the model into an executable form. Other frameworks (e.g., TF, I think) can receive this same information much later, almost at inference time.

Why is execution-target not part of init-execution-context() signature?

I think I am open to migrating this, but first a little bit of "context": the initial design of wasi-nn attempted to minimize as much as possible any differences between it and WebNN (besides the largest difference, of course: "loader API" vs. "builder API"). Where WebNN built up an MLGraph, wasi-nn would load a graph; then both APIs would allow the creation of a context which could compute inference requests. I may have made a mistake (or WebNN may have been slightly different at the time), but we can see now that it is indeed WebNN's MLContext that knows about different execution targets. I think this is a good argument for migrating the API towards what you suggest.

So one option would be to move execution-target to init-execution-context; to figure this out we would need to investigate whether the backends can accept this information at that later point. Another option @mingqiusun and I considered was to merge load and init-execution-context into one function (thoughts?); looking at WebNN more, however, I feel like sticking with the two-step process (load/build -> context) that we currently have is probably a good idea.
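For comparison, the merged single-step option mentioned above could be sketched as follows. This is hypothetical WIT mirroring the current signatures, not a concrete proposal; the function name is invented for illustration:

```wit
// Hypothetical merged function: load the model bytes and create an
// execution context in one call, taking both the encoding and the target,
// so the "where does execution-target belong?" question disappears.
load-and-init: func(builder: graph-builder-array, encoding: graph-encoding, target: execution-target) -> expected<graph-execution-context, error>
```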

Finfalter commented 2 years ago

Thanks a lot for your input. I must admit that I am currently looking at this topic rather from a "TensorFlow (Lite) point of view". I was not aware that other frameworks need the target earlier.

Merging load and init-execution-context crossed my mind as well. As a consequence, the design decision of where to put execution-target would become superfluous, which is a good thing. Also, working with TensorFlow (Lite) and Tract, I see that init-execution-context depends on the model anyway. Since load and init-execution-context are then rather tightly coupled, unifying the two would be a natural consequence.

Maybe I will come back to this with a more profound opinion once I have had a deeper look into OpenVINO.