Hello, I am a code review bot on flows.network. Here are my reviews of code commits in this PR.
Overall Summary:

This pull request consists of multiple patches that update the documentation for running the LLM inference application. The changes mainly involve updating commands, adding example prompts/responses, and providing optional CLI options for more control over model execution.

Potential issues and errors in the pull request include:

- The patches contain no bug fixes or feature enhancements, indicating that they are purely documentation updates.
- The changes come with little explanation, making the reasoning behind the updates hard to follow.
- Context is missing for the command changes and their impact on model performance or behavior.
- There are no comprehensive examples or scenarios for testing the models' performance and accuracy.
- Known issues or bugs related to the models or the inference process are not addressed.
- Changes to URLs or command paths may introduce errors.
- It is unclear whether there are additional changes or potential issues not mentioned in the summaries.

The most important findings are:

- The changes need clearer explanations and justifications.
- Command changes should come with context on how they affect model performance.
- Comprehensive examples are needed to test model accuracy and performance.
- Any errors introduced by changed URLs or command paths should be identified and resolved.

It is recommended that the pull request author address the potential problems and findings mentioned above to provide a more comprehensive and informative update to the LLM inference documentation.
Details
Commit 4903dc372a7bfdd769f5f511e4df95b33725d564
Key changes:
Potential problems:
Commit 8b5d011db69303ed75a4b29c189343e116a83384
Key changes:
Potential problems:
Commit c39905ef75a22900aeb24e1462756acf66a9a0f3

Key changes in the pull request:

- Updated the instructions for the `llama-utils` repo and for getting the chat model.

Potential problems:

- The `cd` command in the Quick Start section is incorrect. It should be removed or updated to point to the correct directory.
- More information is needed about the `--prompt-template` option and its available options (see the sketch after this list).
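For reference, a minimal sketch of how a documented `--prompt-template` invocation might look. The template name `llama-2-chat`, the model file name, and the `--nn-preload` string follow common WasmEdge WASI-NN conventions and are assumptions, not taken from this PR:

```bash
# Hypothetical example of passing a prompt template to the chat app.
# The template name and model file name are assumptions, not from this PR.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  llama-chat.wasm --prompt-template llama-2-chat
```

Listing each supported template value next to a command like this would address the point above.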
Commit 445230837ddee15f9dae76a44abaf3515b7e87ed
Key Changes:

- Updated the URL in the `curl` command to download the model file.

Potential Problems:

- The change in the URL may cause the `curl` command to fail if the new URL is invalid or the file does not exist (see the sketch after this list).
- It is not clear whether any changes were made to the code or instructions after the `curl` command. The summary only mentions the URL change, but there may be other changes that were not included in the summary. Reviewers may need to examine the code and instructions further to identify any additional changes or potential problems.
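As an illustration of the kind of command under discussion — the URL below is a placeholder following common Hugging Face hosting conventions, not the actual URL changed in this commit:

```bash
# Illustrative download command; the URL is an assumption, not the commit's URL.
# -f makes curl exit with an error on HTTP failures (e.g., a dead link),
# -L follows redirects, and -O saves the file under its remote name.
curl -fLO https://huggingface.co/second-state/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf
```

Running the updated command once with `-f` would quickly confirm whether the new URL is valid.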
Commit 6036292bce797410fa4ba3bbe31a202cf9c968fb
Key Changes:

- Updated the name of the model from `llama-2-13b` to `llama-2-7b` in multiple places.
- Removed the command to download the model file `llama-2-13b-chat-q5_k_m.gguf`.
- Updated the `wasmedge` command that loads the model.
- Updated the command to compile the `llama-chat.wasm` file (see the sketch after this list).
- Fixed a typo in the code comments.
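For context, a sketch of what the compile step typically looks like for a Rust-to-Wasm chat app like this one. The cargo target and the AOT alternative are assumptions based on common llama-utils setups, not verbatim from this commit:

```bash
# Assumed build command for producing llama-chat.wasm (not verbatim from this
# commit); the output lands under target/wasm32-wasi/release/.
cargo build --target wasm32-wasi --release

# If the docs instead refer to WasmEdge's AOT compiler, that step would look like:
wasmedgec llama-chat.wasm llama-chat.wasm
```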
Potential Problems:

- The patch changes the `llm_inference.md` documentation file but does not state a reason or purpose for the changes. A more detailed commit message or pull request description explaining the motivation behind these updates would help.
- The command to download the model file has been removed from the instructions, which may confuse users trying to follow the steps.
- The change from `llama-2-13b` to `llama-2-7b` in multiple places raises a question about the compatibility and correctness of the instructions. It should be clarified whether this change is intentional and whether it matches the model version being used.
- The typo fix in the code comments is welcome, but explicitly naming the corrected typo would improve clarity.

Overall, the changes are relatively small and mostly involve updating the model name and corresponding commands. However, some clarifications and explanations are still needed to ensure the updated instructions are correct and easy to follow.