VincyZhang / intel-extension-for-transformers

Extending Hugging Face transformers APIs for Transformer-based models and improving the productivity of inference deployment. With extremely compressed models, the toolkit can greatly improve inference efficiency on Intel platforms.
Apache License 2.0

system prompt can't be assigned via neuralchat frontend #16

Open VincyZhang opened 4 months ago

VincyZhang commented 4 months ago

neuralchat has already synced its RESTful API with the latest OpenAI protocol via 2e1c79d9b99db8bc004d67235fc6df51ca1d238e, but the neuralchat frontend has no field for assigning a system prompt.

backend log

INFO:     127.0.0.1:58004 - "POST /v1/chat/completions HTTP/1.1" 200 OK
2024-02-23 04:14:29,602 - root - INFO - [INST] hi [/INST]
Batches: 100%|██████████| 1/1 [00:00<00:00, 86.64it/s]
2024-02-23 04:14:29,620 - root - INFO - Chat with QA Agent.

frontend log

2024-02-23 04:14:29 | INFO | gradio_web_server | ==== request ====
{'model': '/root/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/examples/deployment/textbot/backend/Llama-2-7b-chat-hf/', 'messages': [{'role': 'user', 'content': 'hi'}], 'temperature': 0.001, 'top_p': 0.95, 'max_tokens': 512, 'stream': True}
2024-02-23 04:14:32 | INFO | httpx | HTTP Request: POST http://127.0.0.1:8080/v1/chat/completions "HTTP/1.1 200 OK"
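As the frontend log shows, the request body contains only a `user` message. Under the OpenAI chat-completions protocol a system prompt is just another entry in `messages` with `"role": "system"`, conventionally placed first. A minimal sketch of how the frontend could attach one before posting (the `with_system_prompt` helper name is hypothetical, not from the repo; payload fields copied from the log above):

```python
def with_system_prompt(payload: dict, system_prompt: str) -> dict:
    """Return a copy of an OpenAI-style request with a system message prepended."""
    messages = [{"role": "system", "content": system_prompt}] + list(payload["messages"])
    return {**payload, "messages": messages}

# Request shape taken from the gradio_web_server log above (model path shortened).
request = {
    "model": "Llama-2-7b-chat-hf",
    "messages": [{"role": "user", "content": "hi"}],
    "temperature": 0.001,
    "top_p": 0.95,
    "max_tokens": 512,
    "stream": True,
}

patched = with_system_prompt(request, "You are a helpful assistant.")
```

The backend that speaks the OpenAI protocol should then pick the system message up without any API change; only the frontend needs a field to populate it.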
NeuralChatBot commented 4 months ago

In the given logs, it seems that the RESTful API is successfully synced with the latest OpenAI protocol and the backend is able to communicate with the frontend. However, there is no mention of a field to assign a system prompt in the frontend. It might be a feature that needs to be added or configured in the frontend code. If you need help, please @NeuralChatBot

VincyZhang commented 4 months ago

The system prompt can now be assigned via the neuralchat backend. The frontend feature will be completed later by @WenjiaoYue due to task scheduling. @NeuralChatBot

NeuralChatBot commented 4 months ago

To assign a system prompt in the backend, you can follow these steps:

  1. Open the file "backend/Llama-2-7b-chat-hf/backend.py" in your text editor.
  2. Locate the function "generate_response" within the "ChatWithQA" class.
  3. Add the desired system prompt as a new entry in the "messages" list. A system message conventionally goes first, for example: messages.insert(0, {'role': 'system', 'content': 'This is the system prompt'})
  4. Save the changes to the file.
  5. Restart the backend server to apply the changes.

Once the backend is updated, the frontend will need to be updated to display and use the assigned system prompt.
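For context on what the backend does with that system message: the backend log above shows the prompt rendered as `[INST] hi [/INST]`, i.e. the Llama-2-chat template with no `<<SYS>>` block. A rough sketch (not the repo's actual code; `build_llama2_prompt` is a hypothetical helper) of how a system message is folded into that template:

```python
def build_llama2_prompt(messages: list) -> str:
    """Render OpenAI-style messages into a single-turn Llama-2-chat prompt.

    If a system message is present, it is wrapped in the <<SYS>> block that
    the Llama-2-chat template expects; otherwise the block is omitted, which
    matches the bare "[INST] hi [/INST]" seen in the backend log.
    """
    system = ""
    last_user = ""
    for m in messages:
        if m["role"] == "system":
            system = m["content"]
        elif m["role"] == "user":
            last_user = m["content"]
    sys_block = f"<<SYS>>\n{system}\n<</SYS>>\n\n" if system else ""
    return f"[INST] {sys_block}{last_user} [/INST]"
```

With no system message this reproduces the logged `[INST] hi [/INST]`; once the frontend can send one, the `<<SYS>>` block appears in the rendered prompt.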