microsoft / autogen

A programming framework for agentic AI 🤖
https://microsoft.github.io/autogen/

httpx.ConnectError: [Errno 111] Connection refused When trying to run on local model #1098

Closed: GregorD1A1 closed this issue 4 months ago

GregorD1A1 commented 9 months ago

Describe the issue

Here is my code:

import autogen

config_list_codellama = [
    {
        'base_url': "http://localhost:1234/v1",
        'api_key': 'NULL',
    }
]

llm_config_codellama={
    "config_list": config_list_codellama,
}

coder = autogen.AssistantAgent(
    name="Coder",
    llm_config=llm_config_codellama
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
    code_execution_config={"work_dir": "web"},
    system_message="""Reply TERMINATE if the task has been solved at full satisfaction.
Otherwise, reply CONTINUE, or the reason why the task is not solved yet."""
)

user_proxy.initiate_chat(coder, message=
"""Write simple fastapi app.""")

I get the following error (see attached screenshots): httpx.ConnectError: [Errno 111] Connection refused

The server with the LLM doesn't receive any request. I tried LM Studio and Ollama + LiteLLM, with the same result.
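
A connection-refused error means nothing answered on that host and port at all. One quick sanity check (a sketch; the /v1/models path assumes an OpenAI-compatible server such as LM Studio's) is to hit the endpoint directly:

import httpx

# If this also fails with ConnectError, the local server isn't reachable at all
# (not started, listening on a different port, or bound to another interface).
try:
    resp = httpx.get("http://localhost:1234/v1/models", timeout=5)
    print(resp.status_code, resp.text)
except httpx.ConnectError as exc:
    print("Server not reachable:", exc)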

Steps to reproduce

No response

Screenshots and logs

No response

Additional Information

No response

rickyloynd-microsoft commented 9 months ago

@victordibia

GregorD1A1 commented 9 months ago

@victordibia can you help?

mferris77 commented 9 months ago

I ran into the same issue. With all the API changes and the abundance of pre- and post-migration examples, it's now very confusing to get set up properly. I think we need to use the OpenAI wrapper to connect with local LLMs. For me, running LM Studio, this is what works (note that I'm running LM Studio on another computer on my network and have 'CORS' mode set to 'on'):

import autogen
from autogen import OpenAIWrapper
client = OpenAIWrapper(base_url="http://192.168.1.123:1234/v1", api_key="not-needed")

config_list = autogen.config_list_from_json(env_or_file="OAI_CONFIG_LIST.json", filter_dict={"model": {"LOCAL_LLM"}})
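# For context: OAI_CONFIG_LIST.json is assumed here to contain an entry along
# these lines (a sketch, not the author's actual file):
# [
#   {
#     "model": "LOCAL_LLM",
#     "base_url": "http://192.168.1.123:1234/v1",
#     "api_key": "not-needed"
#   }
# ]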

llm_config={
    "timeout": 600,
    "config_list": config_list,
    "max_tokens": -1,
    "temperature": 0,
}

client = OpenAIWrapper(base_url="http://192.168.1.129:1234/v1", api_key="not-needed")

completion = client.create(
  model="LOCAL_LLM", # this field is currently unused
  messages=[
    {"role": "system", "content": "Always answer in rhymes."},
    {"role": "user", "content": "Introduce yourself."}
  ],
  temperature=0.0,
)

# extract the response text
print(client.extract_text_or_completion_object(completion))
# print(completion)

Good luck!

(edit: I just realized I'm creating 'client' twice. That's obviously not necessary, but I'll leave it for now, as I don't have time to make sure it still works if I remove the first instance; either way, this should work for you.)

GregorD1A1 commented 9 months ago

@mferris77 Thanks for your reply! I pasted

from autogen import OpenAIWrapper
client = OpenAIWrapper(base_url="http://192.168.1.123:1234/v1", api_key="not-needed")

lines into my code, and it started communicating with LM Studio. So there was probably something wrong with how the OpenAI wrapper was being set up. (By the way, it's interesting that a framework created by Microsoft works perfectly with OpenAI models but is a big pain to set up with local models.)
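
For anyone else landing here, this is a minimal sketch of the config-list route (as opposed to calling OpenAIWrapper directly) for a local OpenAI-compatible server; the "model" value and port are placeholders, not something confirmed in this thread:

import autogen

# Local OpenAI-compatible server (e.g. LM Studio). LM Studio generally ignores
# the model name, but autogen config entries expect a "model" key.
config_list = [
    {
        "model": "LOCAL_LLM",
        "base_url": "http://localhost:1234/v1",
        "api_key": "not-needed",
    }
]

coder = autogen.AssistantAgent(
    name="Coder",
    llm_config={"config_list": config_list},
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "web"},
)

user_proxy.initiate_chat(coder, message="Write a simple FastAPI app.")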

I'm also happy that I can finally see (on the server side) exactly what prompts go into the LLM. The next step will be learning how to edit them (mostly shortening them to improve model attention and reduce costs), but that's a whole other story.
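
This isn't something done in this thread, but one way to trim what gets sent to the model on every turn is to override AssistantAgent's default system message (a sketch with placeholder wording and config values):

import autogen

# Hypothetical local-server config (placeholder values).
config_list = [
    {"model": "LOCAL_LLM", "base_url": "http://localhost:1234/v1", "api_key": "not-needed"}
]

# Replacing the fairly long default system message with a shorter one
# reduces the tokens sent on every request.
coder = autogen.AssistantAgent(
    name="Coder",
    system_message="You are a helpful coding assistant. Reply TERMINATE when the task is done.",
    llm_config={"config_list": config_list},
)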

edit: I commented those lines back out, and autogen started to work anyway. Maybe the problem wasn't with autogen but with my PC; it's hard to say. I need to do more research.