Closed lzw-lzw closed 10 months ago
Are you using the latest code? Did you set llm_type: local
for all the agents in your config?
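For reference, a task config with a local model might look like the following. This is only a sketch: the field nesting is inferred from what is discussed in this thread and may not match the exact AgentVerse config schema.

```yaml
# config.yaml (sketch) -- every agent entry should point at the local model
agents:
  - agent_type: ...             # task-specific settings unchanged
    llm:
      llm_type: local           # instead of the default OpenAI backend
      model: llama-2-7b-chat-hf
```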
I am using the latest code, and after setting llm_type to local, a new error appears: KeyError: 'Could not automatically map llama-2-7b-chat-hf to a tokeniser. Please use tiktoken.get_encoding
to explicitly get the tokeniser you expect.'
I think the problem should have been fixed in a previous commit. Are you using the latest code from the GitHub repo? And did you install AgentVerse
with pip install -e .
?
Thanks for your patient reply. I am using the latest code from the GitHub repo, and I also installed AgentVerse with pip install -e . . After that, I changed MODEL_PATH and MODEL_NAME to the path of my llama-2-7b-chat-hf and "llama-2-7b-chat-hf", and ran run_local_model_server.sh. Then I created a directory under brainstorming containing a config.yaml with llm_type: local and model: llama-2-7b-chat-hf, and ran "python3 agentverse_command/main_tasksolving_cli.py --task tasksolving/brainstorming/llama-2-7b-chat-hf", at which point this error occurred.
Just made some updates to the code. Please check if it's working correctly now. At the moment, I don't have access to a machine with a GPU, so I'm unable to fully run the process with local LLM. If the issue persists, I'll try to find a GPU machine for further debugging.
I'm sorry, it still doesn't work properly; I'll wait for you to try it. Thanks!
I can get it running locally, but the output seems incorrect. The first prompt keeps repeating, and no response is generated.
Pull the latest code and try again. It works fine on my GPU machine now. After launching the FastChat service, check whether it's running correctly by executing this command: curl http://127.0.0.1:5000/v1/models
It should return something like
{"object":"list","data":[{"id":"llama-2-7b-chat-hf","object":"model","created":1699856748,"owned_by":"fastchat","root":"llama-2-7b-chat-hf","parent":null,"permission":[{"id":"modelperm-7bcKCjaRGuVKoeajAXkSgP","object":"model_permission","created":1699856748,"allow_create_engine":false,"allow_sampling":true,"allow_logprobs":true,"allow_search_indices":true,"allow_view":true,"allow_fine_tuning":false,"organization":"*","group":null,"is_blocking":false}]}]}
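The JSON above can also be checked programmatically. A minimal sketch (the model id and port are the ones used in this thread; the helper function is hypothetical, not part of AgentVerse):

```python
import json

def model_is_served(models_json: str, model_id: str) -> bool:
    """Return True if `model_id` appears in a /v1/models response body."""
    data = json.loads(models_json)
    return any(entry.get("id") == model_id for entry in data.get("data", []))

# With the service running, the response body would come from e.g.
#   urllib.request.urlopen("http://127.0.0.1:5000/v1/models").read()
# Here we check a trimmed-down sample of the JSON shown above instead:
sample = '{"object":"list","data":[{"id":"llama-2-7b-chat-hf","object":"model"}]}'
print(model_is_served(sample, "llama-2-7b-chat-hf"))  # True
```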
After confirming the service is running, run the benchmark script with the following command:
python agentverse_command/benchmark.py --task tasksolving/commongen/llama-2-7b-chat-hf --dataset_path data/commongen/commongen_hard.jsonl
This should execute successfully. However, please note that while the script should run, we cannot guarantee its performance, as open-source LLMs generally lag behind OpenAI's GPT models.
@xymou The issue you're encountering might be due to the local model not adhering to the specific response format we've set. In the NLP classroom example, we enforce a strict format where the model's output should be structured as follows:
Action: [specific action]
Action Input: [related input]
OpenAI's GPT models usually comply with this format quite reliably. However, local LLMs might have difficulty consistently generating responses in this precise structure. Our system is designed to automatically retry if it doesn't detect the required pattern in the model's response. This automatic retry mechanism could explain why you're noticing the prompt being repeated.
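The check-and-retry behaviour described above can be illustrated with a small sketch. The regex and function names here are hypothetical, not AgentVerse's actual parser:

```python
import re

# The agent's reply must contain "Action:" and "Action Input:" lines;
# otherwise the same prompt is re-sent (simulated here by calling
# `generate` again).
PATTERN = re.compile(r"Action:\s*(?P<action>.+?)\s*Action Input:\s*(?P<input>.+)",
                     re.DOTALL)

def parse_or_retry(generate, max_retries=3):
    """Call `generate()` until its output matches the required format."""
    for _ in range(max_retries):
        match = PATTERN.search(generate())
        if match:
            return match.group("action").strip(), match.group("input").strip()
    raise ValueError("Model never produced the Action / Action Input format")

# Toy stand-in for a local LLM that only succeeds on the second attempt.
attempts = iter(["I think the answer is 42.",
                 "Action: Speak\nAction Input: The answer is 42."])
action, action_input = parse_or_retry(lambda: next(attempts))
print(action, "|", action_input)  # Speak | The answer is 42.
```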
Thank you for your reply! I've noticed that the open-source LLMs do not follow the instructions to generate responses in the required structure. Do you have any suggestions for solving this? E.g., giving the models an in-context example? But I guess the input length may be a restriction.😰
A workaround may be constrained generation, e.g., with outlines. But we don't support it yet, so you would need to investigate and make some code changes yourself.
It's working fine now, thanks for your patient reply.
Hello, thank you for the excellent framework. When I tried to run the local model following the tutorial, I encountered the following problem: ValueError: llama-2-7b-chat-hf is not registered. Please register with the .register("llama-2-7b-chat-hf") method provided in LLMRegistry registry. What could be the reason for this? Thanks.