Closed MAhmadUzair closed 4 months ago
Hi!
SWE-agent is complex and will probably not run well on small, local LMs. For now, rather than breaking our heads trying to figure out how to make this work, we're going to wait a few months to let local LMs get better before we try to make SWE-agent work with them.
Closing this for now, as we will prioritize other topics, but if this is something you care about then we think it's a cool topic to work on and you should continue. We're sure someone will get this to work at some point.
Describe the bug
I am trying to run SWE-agent on a local LLM without an OpenAI key. For this, I modified the OpenAIModel class to access a Gradio URL. On Gradio I am hosting the model "Code Qwen 0.5B".
Below is the modified OpenAIModel class for the Gradio integration.
```python
class OpenAIModel(BaseModel):
    MODELS = {
        "gpt-3.5-turbo-0125": {
            "max_context": 16_385,
            "cost_per_input_token": 5e-07,
            "cost_per_output_token": 1.5e-06,
        },
        "gpt-3.5-turbo-1106": {
            "max_context": 16_385,
            "cost_per_input_token": 1.5e-06,
            "cost_per_output_token": 2e-06,
        },
        "gpt-3.5-turbo-16k-0613": {
            "max_context": 16_385,
            "cost_per_input_token": 1.5e-06,
            "cost_per_output_token": 2e-06,
        },
        "gpt-4-32k-0613": {
            "max_context": 32_768,
            "cost_per_input_token": 6e-05,
            "cost_per_output_token": 0.00012,
        },
        "gpt-4-0613": {
            "max_context": 8_192,
            "cost_per_input_token": 3e-05,
            "cost_per_output_token": 6e-05,
        },
        "gpt-4-1106-preview": {
            "max_context": 128_000,
            "cost_per_input_token": 1e-05,
            "cost_per_output_token": 3e-05,
        },
        "gpt-4-0125-preview": {
            "max_context": 128_000,
            "cost_per_input_token": 1e-05,
            "cost_per_output_token": 3e-05,
        },
        "gpt-4-turbo-2024-04-09": {
            "max_context": 128_000,
            "cost_per_input_token": 1e-05,
            "cost_per_output_token": 3e-05,
        },
    }
```
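For a locally hosted model, an entry of the same shape could simply carry zero per-token costs. The sketch below only illustrates that idea: the key `qwen1.5-0.5b-chat-local`, its `max_context` value, and the `estimate_cost` helper are hypothetical names chosen for this example and are not part of SWE-agent.

```python
# Hypothetical registry entry for a locally hosted model, mirroring the
# structure of the MODELS dict above. Name and context size are assumptions.
LOCAL_MODELS = {
    "qwen1.5-0.5b-chat-local": {
        "max_context": 32_768,         # assumed; check the model card
        "cost_per_input_token": 0.0,   # local inference has no API cost
        "cost_per_output_token": 0.0,
    },
}


def estimate_cost(model_meta: dict, input_tokens: int, output_tokens: int) -> float:
    """Return the API cost of one call from the per-token prices in model_meta."""
    return (
        model_meta["cost_per_input_token"] * input_tokens
        + model_meta["cost_per_output_token"] * output_tokens
    )


# Prices copied from the "gpt-4-turbo-2024-04-09" entry above, for comparison.
gpt4_turbo = {
    "max_context": 128_000,
    "cost_per_input_token": 1e-05,
    "cost_per_output_token": 3e-05,
}

local_cost = estimate_cost(LOCAL_MODELS["qwen1.5-0.5b-chat-local"], 1000, 200)
hosted_cost = estimate_cost(gpt4_turbo, 1000, 200)
print(local_cost, hosted_cost)
```

With zero prices the cost tracking stays at zero for the local model, while the same call shape still works for the hosted entries.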
Below is the Gradio code for the hosted model.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-0.5B-Chat",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B-Chat")

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated tokens remain
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

from transformers import AutoModelForCausalLM, AutoTokenizer
from gradio import Interface
import torch

# Load the Qwen model (assuming you have transformers installed)
device = "cuda" if torch.cuda.is_available() else "cpu"  # Use GPU if available
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-0.5B-Chat",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B-Chat")


def generate_introduction(prompt):
    """
    Generates an introduction to large language models using the Qwen model,
    formatted for a specific system.
    """
    ...  # the function body is missing from the pasted code


# Define the Gradio interface
interface = Interface(
    fn=generate_introduction,
    inputs="text",
    outputs="text",
    title="Introduction to Large Language Models",
    description="Enter a prompt to get a short introduction to large language models generated by the Qwen model."
)

# Launch the Gradio interface
interface.launch()
```
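The error output in this report shows the `prompt` arriving as a JSON-encoded list of `role`/`content` messages rather than as plain text. A Gradio function that receives such a string would probably need to decode it before building the chat template. The helper below is a minimal sketch of that step; `parse_messages` is a hypothetical name, and the fallback to a single user message is an assumption, not something from SWE-agent or Gradio.

```python
import json


def parse_messages(prompt: str) -> list:
    """Turn the incoming text into a list of {"role", "content"} dicts.

    If the text is a JSON-encoded message list, decode it; otherwise wrap
    the raw text as a single user message.
    """
    try:
        decoded = json.loads(prompt)
    except json.JSONDecodeError:
        decoded = None

    # Accept only a list whose items all look like chat messages.
    if isinstance(decoded, list) and all(
        isinstance(m, dict) and "role" in m and "content" in m for m in decoded
    ):
        return decoded
    return [{"role": "user", "content": prompt}]
```

The returned list has the same shape as the `messages` variable in the script above, so it could be passed to `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` unchanged.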
After pasting the Gradio URL into the OpenAI class in models.py, I ran the project with the command below.
python3.9 run.py --model_name gpt4 --data_path /home/uzair/project/test-repo/problem_statements/1.md --repo_path /home/uzair/project/test-repo --config_file config/default_from_url.yaml --apply_patch_locally
It starts, and the agent copies the issue from the local repo. After copying, it shows the format error below:
```
WARNING FORMAT ERROR Your output was not formatted correctly. You must always include one discussion and one command as part of your response. Make sure you do not have multiple discussion/command tags. Please make sure your output precisely matches the following format: DISCUSSION Discuss here with yourself about what your planning and what you're going to do in this step.
WARNING Malformat limit reached: Failed to get response: {"detail":[{"type":"missing","loc":["body","data"],"msg":"Field required","input":{"prompt":"[{\"role\": \"system\", \"content\": \"SETTING: You are an autonomous programmer,
```
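The `"loc":["body","data"]` and `"Field required"` parts of this error suggest that the endpoint rejected the request because the JSON body carried a `prompt` key and no `data` key. Gradio's HTTP prediction endpoint takes its inputs as a list under `data`. The following is a minimal sketch of such a request using only the standard library. The `/api/predict` path and the shape of the response are assumptions that vary across Gradio versions, and `build_gradio_payload` and `query_gradio` are hypothetical names, not SWE-agent code.

```python
import json
import urllib.request


def build_gradio_payload(prompt: str) -> bytes:
    """Encode one text input as a JSON object with a "data" list,
    which is the field the validation error reports as missing."""
    return json.dumps({"data": [prompt]}).encode("utf-8")


def query_gradio(base_url: str, prompt: str, timeout: float = 120.0) -> str:
    """POST the prompt to the app and return the first output value.

    Assumes the endpoint lives at <base_url>/api/predict and answers with
    a JSON object that also has a "data" list; adjust for your Gradio version.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/predict",
        data=build_gradio_payload(prompt),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["data"][0]
```

Fixing the request shape would only address the second warning. The first warning is about the model's reply not following the DISCUSSION/command format, which is a separate problem.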
Questions
1) Can I run SWE-agent with a local LLM?
2) Can you help me resolve this error?
3) Is it possible to achieve what I am trying to do?
Steps/commands/code to Reproduce
python3.9 run.py --model_name gpt4 --data_path /home/uzair/project/test-repo/problem_statements/1.md --repo_path /home/uzair/project/test-repo --config_file config/default_from_url.yaml --apply_patch_locally
Error message/results
WARNING FORMAT ERROR Your output was not formatted correctly. You must always include one discussion and one command as part of your response. Make sure you do not have multiple discussion/command tags. Please make sure your output precisely matches the following format: DISCUSSION Discuss here with yourself about what your planning and what you're going to do in this step.
WARNING Malformat limit reached: Failed to get response: {"detail":[{"type":"missing","loc":["body","data"],"msg":"Field required","input":{"prompt":"[{\"role\": \"system\", \"content\": \"SETTING: You are an autonomous programmer,
System Information
Windows, Intel Core i7 (8th gen)