second-state / WasmEdge-WASINN-examples

Apache License 2.0
217 stars, 36 forks

[WASI-NN] Add single token inference #75

Closed: dm4 closed this 5 months ago

dm4 commented 5 months ago

This pull request adds single token inference. Implementing it requires changes to three repositories:

  1. wasmedge-wasi-nn: Add two host functions compute_single and get_output_single
  2. WasmEdge: Implement the two WASI-NN functions compute_single and get_output_single
  3. WasmEdge-WASINN-examples: Add examples for single token inference
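
To illustrate how the two calls listed above might be used together, here is a minimal sketch of a token-by-token generation loop. It does not use the real wasmedge-wasi-nn crate: the `SingleTokenContext` trait, the `StepError` enum, the `MockContext` type, and the `generate` helper are all hypothetical stand-ins, and the real signatures of compute_single and get_output_single may differ. Only the two method names come from this PR.

```rust
/// Reasons the loop may stop. Hypothetical: a real backend may report
/// these conditions differently.
#[derive(Debug, PartialEq)]
enum StepError {
    EndOfSequence,
    ContextFull,
}

/// Hypothetical stand-in for an execution context exposing the two new calls.
trait SingleTokenContext {
    /// Run inference for exactly one token.
    fn compute_single(&mut self) -> Result<(), StepError>;
    /// Return the text of the token produced by the last successful step.
    fn get_output_single(&self) -> String;
}

/// Mock backend (assumption, for illustration only) that replays fixed tokens.
struct MockContext {
    tokens: Vec<&'static str>,
    next: usize,
    ctx_size: usize,
}

impl SingleTokenContext for MockContext {
    fn compute_single(&mut self) -> Result<(), StepError> {
        if self.next >= self.tokens.len() {
            return Err(StepError::EndOfSequence);
        }
        if self.next >= self.ctx_size {
            return Err(StepError::ContextFull);
        }
        self.next += 1;
        Ok(())
    }

    fn get_output_single(&self) -> String {
        // Empty string if no token has been computed yet.
        self.next
            .checked_sub(1)
            .map(|i| self.tokens[i].to_string())
            .unwrap_or_default()
    }
}

/// Compute one token at a time, handing each token to `on_token` as soon as
/// it is available, and return the reason generation stopped.
fn generate<C: SingleTokenContext>(ctx: &mut C, mut on_token: impl FnMut(&str)) -> StepError {
    loop {
        match ctx.compute_single() {
            Ok(()) => on_token(&ctx.get_output_single()),
            Err(reason) => return reason,
        }
    }
}
```

A caller would pass a closure such as `|t| print!("{t}")` to stream tokens as they are produced, instead of waiting for the whole completion.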

juntao commented 5 months ago

Hello, I am a code review bot on flows.network. Here are my reviews of code commits in this PR.


Overall summary:

This pull request, titled "[WASI-NN] Add single token inference", adds single token inference support to main.rs and updates README.md in the wasmedge-ggml-llama-interactive directory. The key changes are code that runs inference one token at a time and retrieves the output of each token, and a new flag, try_single_token_inference, that controls this behavior.

The README.md update adds a new Token Usage section. It explains how to use get_output() to retrieve the token usage of the input and output text, and it describes the JSON format of the token usage data. It also advises users to keep the context size and the number of tokens used in mind to avoid exceeding the limit.
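
As a rough illustration of the context-size advice, the sketch below checks how much of the context window is left once token usage is known. The `TokenUsage` struct, its field names, and the `remaining_context` helper are hypothetical: the actual JSON layout is the one documented in the README and is not reproduced here.

```rust
/// Hypothetical container for token counts; field names are assumptions,
/// not the format documented in the README.
struct TokenUsage {
    input_tokens: u64,
    output_tokens: u64,
}

/// Return how many tokens still fit in a context of `ctx_size` tokens,
/// or `None` if the usage already exceeds the limit.
fn remaining_context(usage: &TokenUsage, ctx_size: u64) -> Option<u64> {
    usage
        .input_tokens
        .checked_add(usage.output_tokens)
        .and_then(|used| ctx_size.checked_sub(used))
}
```

A caller could treat a `None` result as a signal to trim the conversation history or to start a new session before the next prompt.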

There don't appear to be any potential problems or errors with this patch. It seems to be a straightforward implementation and documentation update.

Details

Commit 837af3a6155c8be1c06be6d7521c3035da201df0

Key changes:

Commit 05e07867b2263af942fbe1b3187c03c32d51502d

Key changes in the patch:

Potential problems: