Added Multi-round chat API demo

sunyuhan19981208 commented 1 year ago

This pull request adds a new API demo for a multi-round chat feature to the moss project.

The new demo allows users to engage in multiple rounds of conversation with a chatbot and returns a unique uid for each conversation. The uid is used to keep track of the conversation history across rounds.
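The uid bookkeeping described above could look roughly like the sketch below. It is an illustration only, not the code in moss_api_demo.py: `SessionStore`, `chat`, and `echo_model` are hypothetical names, the model call is replaced by a stand-in, and minting a fresh uid when the supplied one is unknown is an assumed design choice.

```python
import uuid
from typing import Callable, Dict, List, Optional, Tuple

History = List[Tuple[str, str]]  # (prompt, response) pairs, oldest first


class SessionStore:
    """In-memory map from uid to conversation history (hypothetical helper)."""

    def __init__(self) -> None:
        self._sessions: Dict[str, History] = {}

    def chat(
        self,
        prompt: str,
        generate: Callable[[str, History], str],
        uid: Optional[str] = None,
    ) -> dict:
        # Assumption: a missing or unknown uid starts a fresh conversation.
        if uid is None or uid not in self._sessions:
            uid = str(uuid.uuid4())
            self._sessions[uid] = []
        history = self._sessions[uid]
        response = generate(prompt, history)
        history.append((prompt, response))
        # Return a copy so callers cannot mutate the stored history.
        return {"response": response, "history": list(history), "uid": uid}


def echo_model(prompt: str, history: History) -> str:
    # Stand-in for the real model call; it only echoes the prompt.
    return f"echo {len(history) + 1}: {prompt}"


store = SessionStore()
first = store.chat("hello", echo_model)
second = store.chat("who are you?", echo_model, uid=first["uid"])
```

Keeping the history on the server means the client only has to send the uid back, at the cost of losing every conversation when the process restarts.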

To use the multi-round chat feature, users can send a POST request:

## curl moss
curl -X POST "http://localhost:19324" \
     -H 'Content-Type: application/json' \
     -d '{"prompt": "你是谁？"}'

The chatbot will respond with a JSON object containing the conversation history for the current round:

{"response":"\n<|Worm|>: 你好，有什么我可以帮助你的吗？","history":[["你好","\n<|Worm|>: 你好，有什么我可以帮助你的吗？"]],"status":200,"time":"2023-04-28 09:43:41","uid":"10973cfc-85d4-4b7b-a56a-238f98689d47"}

To continue a multi-round chat with moss, include the uid from the previous response in the request:

## curl moss multi-round
curl -X POST "http://localhost:19324" \
     -H 'Content-Type: application/json' \
     -d '{"prompt": "你是谁？", "uid":"10973cfc-85d4-4b7b-a56a-238f98689d47"}'
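For callers who prefer Python over curl, the same two requests could be sent with the standard library, as in the minimal sketch below. The URL and the `prompt` and `uid` fields come from the curl examples above; the helper names `build_payload` and `chat` are hypothetical and not part of the PR.

```python
import json
import urllib.request
from typing import Optional

API_URL = "http://localhost:19324"  # address used in the curl examples


def build_payload(prompt: str, uid: Optional[str] = None) -> bytes:
    """Encode the JSON request body; uid is omitted for the first round."""
    body = {"prompt": prompt}
    if uid is not None:
        body["uid"] = uid
    # ensure_ascii=False keeps Chinese prompts readable in the request body.
    return json.dumps(body, ensure_ascii=False).encode("utf-8")


def chat(prompt: str, uid: Optional[str] = None, url: str = API_URL) -> dict:
    """POST one prompt to the demo server and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=build_payload(prompt, uid),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

With the server running, a two-round exchange would be `first = chat("你好")` followed by `second = chat("你是谁？", uid=first["uid"])`, after which `second["history"]` should hold both rounds.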

Changes Made

Added a new moss_api_demo.py file to the moss project.

Documentation

Updated the moss repository README to include a note about the new API demo.

txsun1997 commented 1 year ago

Hi @sunyuhan19981208, many thanks for your effort; this is quite helpful. However, your moss_api_demo.py seems to be based on an old version of our moss_cli_demo.py or a similar file such as the streamlit web demo. That is basically fine, but the old version contains a few bugs caused by our mistake. Please update moss_api_demo.py based on our latest moss_cli_demo.py, which has several changes:

The meta instruction of the sft checkpoints does not contain tool status. That's a bug due to our mistake.
Add arguments to support different model names (fnlp/moss-moon-003-sft, fnlp/moss-moon-003-sft-int8, fnlp/moss-moon-003-sft-int4) and gpus.
Deploy the model on a single gpu or multiple gpus according to the argument --gpu and add necessary checks, e.g., quantized models do not support deployment on multiple gpus.
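The argument handling requested above could be sketched as follows. This is a guess at the shape of the check, not a copy of moss_cli_demo.py: the flag name `--model_name`, the comma-separated format of `--gpu`, and the suffix test for quantized checkpoints are assumptions made for illustration.

```python
import argparse

SUPPORTED_MODELS = [
    "fnlp/moss-moon-003-sft",
    "fnlp/moss-moon-003-sft-int8",
    "fnlp/moss-moon-003-sft-int4",
]


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="MOSS API demo (sketch)")
    # Assumed flag name; the review only asks for "arguments to support
    # different model names".
    parser.add_argument("--model_name", default=SUPPORTED_MODELS[0],
                        choices=SUPPORTED_MODELS)
    # Assumed format: comma-separated GPU ids such as "0" or "0,1".
    parser.add_argument("--gpu", default="0", type=str)
    args = parser.parse_args(argv)

    gpu_ids = [g.strip() for g in args.gpu.split(",") if g.strip()]
    if not gpu_ids:
        parser.error("--gpu must name at least one device")

    # Assumption: quantized checkpoints are recognised by their name suffix.
    quantized = args.model_name.endswith(("-int8", "-int4"))
    if quantized and len(gpu_ids) > 1:
        parser.error("quantized models do not support deployment on multiple gpus")

    args.num_gpus = len(gpu_ids)
    return args
```

Rejecting the invalid combination at argument-parsing time fails fast, before any model weights are loaded.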

sunyuhan19981208 commented 1 year ago

Thank you for your reply and valuable guidance. I have made the necessary modifications to my commit based on your instructions and suggestions.

I appreciate your time and effort in reviewing my work and providing feedback to help improve the quality of the project. Your expertise and attention to detail have been invaluable in guiding me towards the best practices for contributing to the project.

Please let me know if there are any further changes that you would like me to make. I am committed to ensuring that my contributions meet the high standards of the project and am happy to make any necessary adjustments.

Once again, thank you for your support and guidance. I look forward to working with you to make this project even better.

OpenMOSS / MOSS

Added Multi-round chat API demo #249