WebAssembly / wasi-nn

Neural Network proposal for WASI

Eliminate `GraphExecutionContext` #43


abrown commented 11 months ago

This issue proposes a simplification to the wasi-nn API: eliminate the `GraphExecutionContext` state object altogether and instead pass all tensors to and from a single inference call: `compute(list<tensor>) -> result<list<tensor>, ...>`. This change would make `set_input` and `get_output` unnecessary, so they would be removed as well.
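Sketched in guest-side Rust (a hypothetical illustration with stub types; these are not the published wasi-nn bindings, whose real interface is WIT), the change looks roughly like this:

```rust
// Stub types so the sketch compiles; real code would come from bindings.
struct Tensor(Vec<u8>);
struct Graph;
struct GraphExecutionContext;

impl Graph {
    fn init_execution_context(&self) -> GraphExecutionContext {
        GraphExecutionContext
    }
    // Proposed replacement: all inputs in, all outputs out, no state.
    fn compute(&self, inputs: Vec<Tensor>) -> Vec<Tensor> {
        inputs // placeholder; a real host would run inference here
    }
}

impl GraphExecutionContext {
    fn set_input(&mut self, _index: u32, _tensor: Tensor) {}
    fn compute(&mut self) {}
    fn get_output(&self, _index: u32) -> Tensor {
        Tensor(Vec::new())
    }
}

fn main() {
    let graph = Graph;

    // Today: stateful, multi-call inference.
    let mut ctx = graph.init_execution_context();
    ctx.set_input(0, Tensor(vec![1, 2, 3, 4]));
    ctx.compute();
    let _output = ctx.get_output(0);

    // Proposed: one stateless call; set_input/get_output disappear.
    let _outputs = graph.compute(vec![Tensor(vec![1, 2, 3, 4])]);
}
```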

As background, the WITX IDL is the reason `GraphExecutionContext` exists. As I understood it back when wasi-nn was originally designed, WITX forced us to pass an empty "pointer + length" buffer across to the host so that the host could fill it. This led to `get_output(...)`, which included an `index` parameter for retrieving tensors from multi-output models (multiple outputs are uncommon, but a possibility that must be handled). Because `get_output` was now separate from `compute`, we needed some state to track the inference request: hence `GraphExecutionContext`.
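A hedged reconstruction of that witx-era shape (the real signatures differ in detail) makes the problem concrete: the guest must pre-allocate the buffer, and the context handle is the state tying `get_output` back to the earlier `compute`:

```rust
// Hedged reconstruction; not the exact historical witx signature.
type GraphExecutionContext = u32;
type Errno = u16;

extern "C" {
    // The guest pre-allocates `out_buffer`; the host fills it. `index`
    // selects among a multi-output model's tensors. Because this call is
    // separate from compute(), the `ctx` handle must exist to carry
    // inference state between the two.
    fn get_output(
        ctx: GraphExecutionContext,
        index: u32,
        out_buffer: *mut u8,
        out_buffer_max_size: u32,
    ) -> Errno;
}
```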

Now, with WIT, we can expect the ABI to host-allocate into our WebAssembly linear memory for us. This is better in two ways: the guest no longer has to pre-allocate an output buffer and ask the host to fill it, and with outputs returned directly from `compute`, the `set_input`/`get_output` calls and the state behind them are no longer needed.
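Concretely, WIT bindings generate a realloc export, conventionally `cabi_realloc`, that the host calls to place returned values such as output tensors directly into the guest's linear memory. A simplified sketch of that export (real generated code handles more edge cases, e.g. zero-size allocations and allocation failure):

```rust
use std::alloc::{self, Layout};

// Simplified version of the realloc export that WIT bindings generate.
#[no_mangle]
pub unsafe extern "C" fn cabi_realloc(
    ptr: *mut u8,
    old_size: usize,
    align: usize,
    new_size: usize,
) -> *mut u8 {
    let new_layout = Layout::from_size_align(new_size, align).unwrap();
    if old_size == 0 {
        // Fresh allocation: the host is asking for space to write into,
        // e.g. for the output tensors returned by compute().
        return alloc::alloc(new_layout);
    }
    let old_layout = Layout::from_size_align(old_size, align).unwrap();
    alloc::realloc(ptr, old_layout, new_size)
}
```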

One consideration here is ML framework compatibility: some frameworks (e.g., OpenVINO) expose an equivalent to `GraphExecutionContext` in their external API that implementations of wasi-nn must call. But because this context object can be created inside the implementation, there is no compatibility issue. Implementations of `compute` would do a bit more per call than they currently do, but no more work overall; the work simply moves out of the removed calls and into `compute`.
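For instance, a host-side implementation of the new `compute` might look like the following sketch (hypothetical trait and method names, not actual Wasmtime or OpenVINO code), with the framework's context object created, used, and dropped entirely inside the call:

```rust
struct Tensor(Vec<u8>);
struct Error;

// Stand-ins for a framework API such as OpenVINO's; names are illustrative.
trait Backend {
    type Request: InferRequest;
    fn create_infer_request(&self) -> Result<Self::Request, Error>;
}

trait InferRequest {
    fn set_input(&mut self, index: usize, tensor: &Tensor) -> Result<(), Error>;
    fn infer(&mut self) -> Result<(), Error>;
    fn output_count(&self) -> usize;
    fn get_output(&self, index: usize) -> Result<Tensor, Error>;
}

// The context object lives entirely inside compute: created, used, dropped.
// Per-call creation is shown for simplicity; an implementation could cache
// the request on the graph to avoid repeated setup cost.
fn compute<B: Backend>(graph: &B, inputs: &[Tensor]) -> Result<Vec<Tensor>, Error> {
    let mut request = graph.create_infer_request()?;
    for (index, tensor) in inputs.iter().enumerate() {
        request.set_input(index, tensor)?;
    }
    request.infer()?;
    (0..request.output_count())
        .map(|i| request.get_output(i))
        .collect()
}
```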

Another consideration is memory-copying overhead: will WIT force us to copy the tensor bytes across the guest-host boundary in both directions? Tensors can be large, and additional copies could be expensive. For output tensors, this may be unavoidable: when a tensor is generated on the host side during inference, it must somehow be made accessible to the Wasm guest, and copying is a simple solution. For input tensors, though, this discussion suggests that WIT itself may not force the copy. If tensor copying becomes a bottleneck, perhaps WIT resources could be the solution (see the sketch below).
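As a rough illustration of that last idea (nothing below is in the wasi-nn spec; every name is hypothetical), tensors-as-resources would let opaque handles, rather than bytes, cross the boundary, so chained models pay no intermediate copy:

```rust
// Opaque resource handle; the tensor bytes stay on the host.
struct TensorHandle(u32);

struct Graph;

impl Graph {
    // Handles in, handles out: no bulk tensor data crosses the boundary.
    fn compute(&self, inputs: Vec<TensorHandle>) -> Vec<TensorHandle> {
        inputs // placeholder; a real host would run inference here
    }
}

impl TensorHandle {
    // Only this call pays for a guest-host copy, and only when the guest
    // actually needs the raw bytes.
    fn bytes(&self) -> Vec<u8> {
        Vec::new() // placeholder for a host call
    }
}

fn main() {
    let (encoder, decoder) = (Graph, Graph);
    let hidden = encoder.compute(vec![TensorHandle(0)]);
    let logits = decoder.compute(hidden); // chained with zero copies
    let _bytes = logits[0].bytes(); // a single copy at the very end
}
```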

shschaefer commented 9 months ago

@abrown, the session/execution-context interface caches configuration parameters and device selection. Are you going to leave this state on the graph, initialized during model load? And would the call to `compute` then accept the graph as an input instead of the `GraphExecutionContext`?

The additional scenario for an execution context is carrying this parameterization across multiple models (chaining) or multiple modes: natively and in the browser, we pass the GPU context between DirectX and ONNX, or between WebGPU and WebNN.