bytecodealliance / wasm-micro-runtime

WebAssembly Micro Runtime (WAMR)
Apache License 2.0

[wasi-nn] add more backends into `libraries/wasi-nn` #3493

Open lum1n0us opened 1 month ago

lum1n0us commented 1 month ago

Background

We are planning to add more backends to WAMR to enrich its capabilities in the ML/LLM/AI field. Currently, the only backend (in libraries/wasi-nn) is tensorflow-lite.

The code in core/iwasm/libraries/wasi-nn can be separated into two parts:

Every backend pulls in huge libraries. For example, libtensorflow-lite.a for the tensorflow-lite backend is 651M. It is not feasible to pack all backends, or even one backend, into the iwasm binary. A natural approach is to create a shared library for each backend.

Proposal

There will be libwasi_nn_backend_tflite.so, libwasi_nn_backend_openvino.so, and so on. Users of iwasm need to pass the right backend with --native-lib, for example --native-lib=libwasi_nn_backend_tflite.so or --native-lib=libwasi_nn_backend_pytorch.so, before running a wasi-nn wasm application.
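The naming convention above can be sketched in a few lines of C. This is only an illustration: `backend_lib_name()` and `native_lib_arg()` are hypothetical helpers, not WAMR functions, and the sketch assumes every backend library follows the `libwasi_nn_backend_<name>.so` pattern.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "libwasi_nn_backend_<name>.so" into out.
 * Returns 0 on success, -1 if the name is empty or the buffer is too small. */
int
backend_lib_name(const char *backend, char *out, size_t out_size)
{
    assert(backend != NULL && out != NULL);
    if (backend[0] == '\0')
        return -1;
    int n = snprintf(out, out_size, "libwasi_nn_backend_%s.so", backend);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Hypothetical helper: writes the "--native-lib=<library>" argument
 * that a user would pass to iwasm for the given backend. */
int
native_lib_arg(const char *backend, char *out, size_t out_size)
{
    char lib[128];
    assert(out != NULL);
    if (backend_lib_name(backend, lib, sizeof(lib)) != 0)
        return -1;
    int n = snprintf(out, out_size, "--native-lib=%s", lib);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For the backend name `tflite`, `native_lib_arg()` produces `--native-lib=libwasi_nn_backend_tflite.so`, which matches the command-line form described above.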

Each of them contains

## Choice: use vmlib.a or iwasm.so

Since the general wasi-nn code (wasi_nn.c) uses WAMR APIs (like wasm_runtime_malloc(), bh_hash_map_xxx(), ...), the final libwasi_nn_backend_xxx.so depends not only on the ML framework libraries but also on the WAMR libraries, specifically libvmlib.a or libiwasm.so. Because iwasm also depends on the same library, a problem arises.

If both iwasm and libwasi_nn_backend_xxx.so link the static library (vmlib.a), each of them has to maintain its own runtime state, because two copies of the static library produce two separate sets of state. For example, even though iwasm calls load_and_register_native_lib() after wasm_runtime_full_init(), wasm_runtime_malloc() in libwasi_nn_backend_xxx.so will not work unless the library calls wasm_runtime_full_init() itself. The reason is that there are two copies of the global variable memory_mode, one in each component.
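The effect can be modeled in plain C. The sketch below is a simulation, not WAMR code: `runtime_copy_t`, `model_full_init()` and `model_malloc()` are hypothetical stand-ins for the per-binary globals, wasm_runtime_full_init() and wasm_runtime_malloc(), and it assumes the allocator refuses to work while its own memory_mode is still uninitialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for the runtime's global state. Each binary that
 * links the static library gets its own private copy of such globals. */
typedef enum { MEMORY_MODE_UNKNOWN = 0, MEMORY_MODE_ALLOCATOR } memory_mode_t;

typedef struct {
    memory_mode_t memory_mode;
} runtime_copy_t;

/* Models the init call: it only touches the copy it is linked into. */
void
model_full_init(runtime_copy_t *rt)
{
    assert(rt != NULL);
    rt->memory_mode = MEMORY_MODE_ALLOCATOR;
}

/* Models the allocator: it fails while its own copy is uninitialized. */
void *
model_malloc(const runtime_copy_t *rt, size_t size)
{
    assert(rt != NULL);
    if (rt->memory_mode == MEMORY_MODE_UNKNOWN)
        return NULL;
    return malloc(size);
}

/* Zero-initialized, so both start in MEMORY_MODE_UNKNOWN. */
runtime_copy_t iwasm_copy;   /* globals linked into iwasm */
runtime_copy_t backend_copy; /* globals linked into the backend library */
```

Calling `model_full_init(&iwasm_copy)` leaves `backend_copy` untouched, so `model_malloc(&backend_copy, ...)` keeps returning NULL until the backend initializes its own copy. With a single shared library there would be only one such copy.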

A quick and simple solution is to let both iwasm and libwasi_nn_backend_xxx.so use the dynamic library (libiwasm.so), so that they share a single runtime state. The drawback is that it may change the distribution form of iwasm. Even so, this is the approach the proposal takes.

The alternative is not used because it would take a lot of effort to resolve the problems that come with it, such as:

## Choice: cross platform APIs

Another way to keep the runtime state in iwasm and libwasi_nn_backend_xxx.so in sync is to avoid having two copies of the state in the first place. That means leaving wasi_nn.c in iwasm and packing only the ML framework (tensorflow-lite.a) and the integration code (wasi_nn_tensorflowlite.cpp) together. In addition, the WAMR APIs used in the integration code need to be chosen carefully. The majority of those WAMR APIs are cross-platform abstractions. Considering the very limited number of platforms that ML frameworks support, we may not need to involve any WAMR APIs at all. This can be handled by creating a separate cross-platform layer for the backends or by using POSIX for now.
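A minimal sketch of what such a layer might look like, assuming plain libc is enough on the supported platforms. The `nn_os_*` names and `nn_copy_model()` are hypothetical and do not exist in WAMR. The point is only that the integration code calls the layer rather than wasm_runtime_malloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical platform layer: the only place that touches the OS/libc.
 * Porting the backend to a new platform would mean changing just these. */
void *
nn_os_malloc(size_t size)
{
    return malloc(size);
}

void
nn_os_free(void *ptr)
{
    free(ptr);
}

/* Example integration-code helper built on the layer above: copies a
 * model buffer so the backend owns its data. Returns NULL on failure. */
void *
nn_copy_model(const void *src, size_t size)
{
    assert(src != NULL);
    if (size == 0)
        return NULL;
    void *dst = nn_os_malloc(size);
    if (dst != NULL)
        memcpy(dst, src, size);
    return dst;
}
```

Because nothing here refers to WAMR globals, a backend built this way carries no second copy of the runtime state.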

## Choice: work with libwasi_nn_backend_xxx.so

If libwasi_nn_backend_xxx.so does not contain wasi_nn.c, the wasi-nn backend can't be used via iwasm --native-lib=libwasi_nn_backend_xxx.so. It turns into a "dlopen()/dlsym()" case.
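For reference, the "dlopen()/dlsym()" case could look roughly like the sketch below. `load_backend_symbol()` is a hypothetical helper, and the symbol a real backend would export is not defined in this issue, so any symbol name passed in is an assumption. On older glibc versions the program may need to be linked with `-ldl`.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: opens a backend library and resolves one symbol.
 * On success it stores the dlopen() handle in *out_handle and returns the
 * symbol address. On failure it returns NULL and leaves no handle open. */
void *
load_backend_symbol(const char *lib_path, const char *symbol,
                    void **out_handle)
{
    assert(lib_path != NULL && symbol != NULL && out_handle != NULL);
    *out_handle = NULL;

    void *handle = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return NULL;
    }

    void *addr = dlsym(handle, symbol);
    if (addr == NULL) {
        fprintf(stderr, "dlsym failed for %s\n", symbol);
        dlclose(handle);
        return NULL;
    }

    *out_handle = handle;
    return addr;
}
```

In this model, iwasm itself would own the loading logic and call into the backend through the resolved function pointers, instead of relying on the --native-lib registration path.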

## Choice: distribution of iwasm and iwasm-nn

Usually, a release of iwasm is just one executable binary, iwasm. Since the wasi-nn backend plan requires dynamic libraries, a release of iwasm-nn becomes iwasm-nn + libiwasm.so + libwasi_nn_backend_xxx.so (one library per backend).

By default,