xujiangyu opened this issue 9 months ago
Hi @xujiangyu! If you are referring to the examples/server application, you can access it by entering the server address (e.g., 127.0.0.1:8080) in your browser. This allows you to interact with the model via a simple UI and see the outputs. For more details, please refer to the server documentation. Additionally, all inference outputs from the server are also printed to stdout, and you can query it programmatically, as in the sketch below.
Most other applications print their inference results directly to the command line. You can find usage instructions in the corresponding examples/[application] directory, where each application's README and source code are available.
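As a minimal sketch of querying the server without the browser UI, the following assumes the server is running on the default address and exposes the standard /completion endpoint (check the server README for the exact API of your build):

```python
# Query a running examples/server instance and print the generated text.
# Assumes the server was started on the default address, e.g.:
#   ./server -m models/your-model.gguf --host 127.0.0.1 --port 8080
import json
import urllib.request

payload = {
    "prompt": "Building a website can be done in 10 simple steps:",
    "n_predict": 64,  # number of tokens to generate
}

req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# The completion endpoint returns the generated text in the "content" field.
print(result["content"])
```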
Thank you for your reply. I wonder how to add background knowledge through the parameters, such as for a RAG flow. I checked the parameters of the main function and didn't find such a specific parameter.
Adding background knowledge is an application-layer concern and amounts to no more than injecting the relevant information into the prompt. This project focuses on LLM inference and doesn't provide dedicated support for that.
I suggest using a wrapper such as the llama-cpp-python library (you can use our forked version here), or the server endpoint. You can then use any mainstream orchestration framework, such as LangChain, to easily build the RAG workflow.
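As a rough illustration of the prompt-injection idea, here is a minimal sketch using llama-cpp-python. The model path and prompt template are placeholders, and in a real RAG pipeline the hard-coded context would come from a retriever (e.g., a vector store queried through LangChain):

```python
# Minimal RAG-style prompt injection with llama-cpp-python.
# The retrieved passage is simply prepended to the question in the prompt.
from llama_cpp import Llama

llm = Llama(model_path="models/your-model.gguf")  # placeholder path

# In a real pipeline this would come from a retriever / vector store.
retrieved_context = "The Eiffel Tower is 330 metres tall and located in Paris."
question = "How tall is the Eiffel Tower?"

prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{retrieved_context}\n\n"
    f"Question: {question}\nAnswer:"
)

output = llm(prompt, max_tokens=64, stop=["\n\n"])
print(output["choices"][0]["text"])
```

The key point is that the "background knowledge" never passes through a dedicated inference parameter; it is just text concatenated into the prompt before generation.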