All-Hands-AI / OpenHands

🙌 OpenHands: Code Less, Make More
https://all-hands.dev
MIT License

Add Qwen2.5 32B model #4952

Closed · BioCruz-sudo closed 2 hours ago

BioCruz-sudo commented 1 day ago

Implementation of Qwen 2.5 Coder Model in OpenHands

What problem or use case are you trying to solve?

OpenHands currently lacks support for newer, powerful open-source coding models that can run locally. The Qwen 2.5 Coder (32B) model represents a significant advancement in open-source coding assistance.

Describe the UX of the solution you'd like

Users should be able to:

  1. Easily configure Qwen 2.5 through either:
    • UI-based configuration in OpenHands settings
    • Direct config.toml modification
  2. Experience seamless integration with existing OpenHands workflows:
    [llm]
    model = "ollama/qwen2.5-coder:32b"
    base_url = "http://localhost:11434"
    api_key = "dummy"  # any non-empty value works for Ollama
  3. Have clear feedback on model status:
    • Model loading progress
    • Resource usage warnings
    • Clear error messages for common issues
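For the Docker-based install, the same settings could be supplied as environment variables rather than config.toml; a minimal sketch (variable names follow the run command quoted later in this thread, sandbox-related flags are omitted for brevity, and the API key is assumed to accept any non-empty value for Ollama):

    docker run -it --rm --pull=always \
        -e LLM_MODEL="ollama/qwen2.5-coder:32b" \
        -e LLM_BASE_URL="http://localhost:11434" \
        -e LLM_API_KEY="dummy" \
        -p 3000:3000 \
        --name openhands-app \
        docker.all-hands.dev/all-hands-ai/openhands:0.13

Note that from inside the container, localhost refers to the container itself, not the host; see the networking discussion later in this thread.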

Technical Implementation Details

The implementation leverages existing LiteLLM integration in OpenHands:

  1. Model Integration:

    Core components needed:
    • Ollama provider configuration
    • Model parameter settings
    • Retry logic for local processing
    • Resource monitoring
  2. Configuration Requirements:

    • Memory: 32GB RAM minimum recommended
    • Storage: ~60GB for model weights
    • GPU: Recommended for optimal performance
  3. Implementation Steps (a fuller sketch follows this list):

    # 1. Ollama Setup
    ollama pull qwen2.5-coder:32b
    # 2. OpenHands Configuration (UI Settings or config.toml, as above)
  4. Error Handling:
    • Resource availability checks
    • Graceful fallback options
    • Clear error messages for users
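A sketch of the pre-flight checks implied by the steps above (assumes a local Ollama listening on its default port 11434):

    # Pull the model and confirm it is registered under the exact tag.
    ollama pull qwen2.5-coder:32b
    ollama list

    # The HTTP API should report the model too.
    curl -s http://localhost:11434/api/tags

    # Smoke-test generation directly against Ollama before involving OpenHands.
    curl -s http://localhost:11434/api/generate \
      -d '{"model": "qwen2.5-coder:32b", "prompt": "def add(a, b):", "stream": false}'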

Alternatives Considered

  1. Other Local Models:

    • CodeLlama: Less powerful but lighter resource requirements
    • DeepSeek Coder: Similar capabilities but different trade-offs
    • MPT-7B-StableCode: Smaller but less capable
  2. Alternative Implementations:

    • Direct model loading without Ollama
      • Pros: More control over model parameters
      • Cons: More complex implementation, higher maintenance burden
    • vLLM backend (see the sketch after this list)
      • Pros: Better performance
      • Cons: More complex setup, less stable with local models
  3. Cloud-Based Solutions:

    • Claude/GPT-4 Code Models
      • Pros: More powerful, no local resources needed
      • Cons: Cost, latency, privacy concerns
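For the vLLM alternative above, serving would look roughly like this (a sketch; assumes vLLM's OpenAI-compatible server, the Hugging Face model id Qwen/Qwen2.5-Coder-32B-Instruct, and enough GPU memory for the 32B weights):

    # Serve the model behind an OpenAI-compatible endpoint.
    vllm serve Qwen/Qwen2.5-Coder-32B-Instruct --port 8000

    # OpenHands could then treat it as an OpenAI-compatible provider, e.g.
    # model "openai/Qwen/Qwen2.5-Coder-32B-Instruct" with
    # base_url "http://localhost:8000/v1".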

Additional Context

Performance Metrics

Initial testing shows Qwen 2.5 Coder performance:

Resource Management

Implementation includes:

Future Considerations

  1. Model quantization support
  2. Multi-GPU support
  3. Performance optimizations
  4. Caching improvements
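For the quantization item, Ollama already publishes quantized variants, so initial support may mostly be a documentation matter; a sketch (tag names are illustrative; check the Ollama model page for the exact list):

    # Lower-memory variants of the same family (illustrative tags).
    ollama pull qwen2.5-coder:32b-instruct-q4_K_M
    ollama pull qwen2.5-coder:14b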

Documentation Requirements

Will need to update:

  1. Installation guide
  2. Configuration documentation
  3. Troubleshooting guide
  4. Performance optimization guide

Testing Strategy

  1. Unit tests for integration
  2. Performance benchmarks
  3. Resource usage monitoring
  4. Error handling verification

Dependencies

Timeline

  1. Initial implementation: 1-2 days
  2. Testing and optimization: 2-3 days
  3. Documentation: 1 day
  4. Review and refinement: 1-2 days

Would you like to proceed with this implementation plan?

enyst commented 1 day ago

Thank you for the proposal! If it works with Ollama, then it works with OpenHands; you just need to enter the name, as you note, in the UI Settings or in config.toml if running in development mode.

This assumes you have pulled it in Ollama, and please make sure to use the exact name as returned by ollama list.

It might be a bit difficult to run locally in Ollama; it depends on the machine. It should be possible to use a remote server as well. Have you successfully run it in Ollama?
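For reference, the name to copy is the NAME column of ollama list; illustrative output (the ID and size values here are made up):

    $ ollama list
    NAME                 ID              SIZE     MODIFIED
    qwen2.5-coder:32b    0123456789ab    19 GB    5 minutes ago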

BioCruz-sudo commented 1 day ago

> Thank you for the proposal! If it works with Ollama, then it works with OpenHands; you just need to enter the name, as you note, in the UI Settings or in config.toml if running in development mode.
>
> This assumes you have pulled it in Ollama, and please make sure to use the exact name as returned by ollama list.
>
> It might be a bit difficult to run locally in Ollama; it depends on the machine. It should be possible to use a remote server as well. Have you successfully run it in Ollama?

Thanks for the reply. I tried a couple of different methods, from pulling the qwen2.5 model into the Docker container to including the model in the run command, as such:

docker run -it --rm --pull=always `
    --network ollama-network `
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.13-nikolaik `
    -v /var/run/docker.sock:/var/run/docker.sock `
    -p 3000:3000 `
    -e LOG_ALL_EVENTS=true `
    -e LLM_PROVIDER=ollama `
    -e LLM_MODEL="ollama/qwen2.5-coder:32b" `
    -e LLM_BASE_URL="http://ollama:11434" `
    --name openhands-app `
    docker.all-hands.dev/all-hands-ai/openhands:0.13

But it doesn't work; I keep receiving errors from LiteLLM, or from specific agents that cannot call the API, such as the one below:

What's received in the Workspace/GUI for OpenHands: litellm.ServiceUnavailableError: OllamaException: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9062ee8530>: Failed to establish a new connection: [Errno 111] Connection refused'))

What shows up in the WSL terminal window:


PS C:\Users\Alan> docker run -it --pull=always `
>>     -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.13-nikolaik `
>>     -v /var/run/docker.sock:/var/run/docker.sock `
>>     -p 3000:3000 `
>>     --add-host host.docker.internal:host-gateway `
>>     --name openhands-app `
>>     docker.all-hands.dev/all-hands-ai/openhands:0.13
0.13: Pulling from all-hands-ai/openhands
Digest: sha256:28307e6ef3ca477df56e0689c0fcaa8b6f073d018cf90c945317b5b76c8566cb
Status: Image is up to date for docker.all-hands.dev/all-hands-ai/openhands:0.13
Starting OpenHands...
Running OpenHands as root
INFO:     Started server process [10]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:3000 (Press CTRL+C to quit)
INFO:     172.17.0.1:55116 - "GET / HTTP/1.1" 200 OK
INFO:     172.17.0.1:55116 - "GET /assets/root-BTY3kXKc.css HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55116 - "GET /assets/manifest-ca3f4d9e.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55114 - "GET /assets/entry.client-BB5CzcwV.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55116 - "GET /assets/index-DW5fD9dL.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55160 - "GET /assets/module-Pmfdz5Pn.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55124 - "GET /assets/jsx-runtime-zhgTtAea.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55150 - "GET /assets/i18nInstance-BhwiS_NP.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55134 - "GET /assets/components-Bd8d2nMd.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55114 - "GET /assets/store-DY7dMTlU.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55160 - "GET /assets/index-DvhLSMS6.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55124 - "GET /assets/AgentState-DCyv5jZT.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55134 - "GET /assets/codeSlice-IFyzp_hl.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55116 - "GET /assets/router-CQIzOUUf.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55150 - "GET /assets/agentSlice-0xeL04tr.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55114 - "GET /assets/index-BgBamcox.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55160 - "GET /assets/root-D1m7AcLM.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55160 - "GET /locales/en/translation.json HTTP/1.1" 200 OK
INFO:     172.17.0.1:55160 - "GET /assets/_oh-CV4HCGEY.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55114 - "GET /assets/route-BMZHP0Ck.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55114 - "GET /favicon.ico HTTP/1.1" 200 OK
INFO:     172.17.0.1:55134 - "GET /assets/github-logo-DZYNM_ty.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55114 - "GET /assets/connect-to-github-modal-DkkdJ7YH.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55160 - "GET /assets/utils-p7Tk_-vA.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55150 - "GET /assets/modal-backdrop-BPjJW9C9.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55124 - "GET /assets/responses-iq082X3z.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55116 - "GET /assets/open-hands-B07gLq0Q.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55134 - "GET /assets/extends-CEetKkoc.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55114 - "GET /assets/declaration-DblB42Xt.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55150 - "GET /assets/LoadingProject-B0fD2l0z.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55124 - "GET /assets/settings-DawUpo-u.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55160 - "GET /assets/cache-CGyOfnUE.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:55160 - "GET /config.json HTTP/1.1" 200 OK
INFO:     172.17.0.1:55160 - "GET /config.json HTTP/1.1" 200 OK
INFO:     172.17.0.1:55160 - "GET /api/options/models HTTP/1.1" 200 OK
INFO:     172.17.0.1:55124 - "GET /api/options/agents HTTP/1.1" 200 OK
INFO:     172.17.0.1:55150 - "GET /api/options/security-analyzers HTTP/1.1" 200 OK
INFO:     172.17.0.1:59150 - "GET /assets/settings-BWHpKwXn.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:50278 - "GET /assets/route-DR3OkefA.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:50284 - "GET /assets/_oh.app-Cw8GyXIe.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:50278 - "GET /assets/index-lsr95zsB.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:50284 - "GET /assets/_oh.app-Ctn17uqN.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:50286 - "GET /assets/useScrollToBottom-B8_iP14x.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:50296 - "GET /assets/iconBase-rNSBfUUz.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:50316 - "GET /assets/index-BfUcQcX0.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:50300 - "GET /assets/index-CVhOlqYX.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:50300 - "GET /assets/Terminal-CrIF52FG.js HTTP/1.1" 304 Not Modified
INFO:     172.17.0.1:50316 - "GET /assets/Terminal-CGrrDQr5.css HTTP/1.1" 304 Not Modified
INFO:     ('172.17.0.1', 50326) - "WebSocket /ws" [accepted]
21:55:18 - openhands:INFO: listen.py:341 - New session: ed687aba-d0de-4901-8d85-509faf0ce95a
INFO:     connection open

Provider List: https://docs.litellm.ai/docs/providers

Provider List: https://docs.litellm.ai/docs/providers

21:55:18 - openhands:WARNING: codeact_agent.py:90 - Function calling not supported for model ollama/qwen2.5-coder:32b. Disabling function calling.

Provider List: https://docs.litellm.ai/docs/providers

Provider List: https://docs.litellm.ai/docs/providers

21:55:18 - openhands:INFO: eventstream_runtime.py:220 - [runtime ed687aba-d0de-4901-8d85-509faf0ce95a] Starting runtime with image: docker.all-hands.dev/all-hands-ai/runtime:0.13-nikolaik
21:55:19 - openhands:INFO: eventstream_runtime.py:224 - [runtime ed687aba-d0de-4901-8d85-509faf0ce95a] Container started: openhands-runtime-ed687aba-d0de-4901-8d85-509faf0ce95a
21:55:19 - openhands:INFO: eventstream_runtime.py:227 - [runtime ed687aba-d0de-4901-8d85-509faf0ce95a] Waiting for client to become ready at http://host.docker.internal:35808...
21:55:33 - openhands:INFO: eventstream_runtime.py:233 - [runtime ed687aba-d0de-4901-8d85-509faf0ce95a] Runtime is ready.
21:55:33 - openhands:WARNING: state.py:119 - Could not restore state from session: sessions/ed687aba-d0de-4901-8d85-509faf0ce95a/agent_state.pkl
21:55:33 - openhands:INFO: agent_controller.py:193 - [Agent Controller ed687aba-d0de-4901-8d85-509faf0ce95a] Starting step loop...
21:55:33 - openhands:INFO: agent_controller.py:316 - [Agent Controller ed687aba-d0de-4901-8d85-509faf0ce95a] Setting agent(CodeActAgent) state from AgentState.LOADING to AgentState.INIT
21:55:33 - openhands:INFO: agent_controller.py:316 - [Agent Controller ed687aba-d0de-4901-8d85-509faf0ce95a] Setting agent(CodeActAgent) state from AgentState.INIT to AgentState.RUNNING

Provider List: https://docs.litellm.ai/docs/providers

Provider List: https://docs.litellm.ai/docs/providers

21:55:33 - openhands:INFO: manager.py:31 - Conversation ed687aba-d0de-4901-8d85-509faf0ce95a connected in 0.02588510513305664 seconds
INFO:     172.17.0.1:57240 - "GET /api/list-files HTTP/1.1" 200 OK

Provider List: https://docs.litellm.ai/docs/providers

Provider List: https://docs.litellm.ai/docs/providers

21:55:33 - openhands:INFO: manager.py:31 - Conversation ed687aba-d0de-4901-8d85-509faf0ce95a connected in 0.02508068084716797 seconds
INFO:     172.17.0.1:57240 - "GET /api/list-files HTTP/1.1" 200 OK

Provider List: https://docs.litellm.ai/docs/providers

Provider List: https://docs.litellm.ai/docs/providers

21:55:33 - openhands:INFO: manager.py:31 - Conversation ed687aba-d0de-4901-8d85-509faf0ce95a connected in 0.018009185791015625 seconds
INFO:     172.17.0.1:57248 - "GET /api/list-files HTTP/1.1" 200 OK

==============
[Agent Controller ed687aba-d0de-4901-8d85-509faf0ce95a] LEVEL 0 LOCAL STEP 0 GLOBAL STEP 0

Provider List: https://docs.litellm.ai/docs/providers

Provider List: https://docs.litellm.ai/docs/providers

Provider List: https://docs.litellm.ai/docs/providers

Provider List: https://docs.litellm.ai/docs/providers

Provider List: https://docs.litellm.ai/docs/providers

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True'.

Provider List: https://docs.litellm.ai/docs/providers

21:55:34 - openhands:ERROR: retry_mixin.py:47 - litellm.ServiceUnavailableError: OllamaException: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fae21424f80>: Failed to establish a new connection: [Errno 111] Connection refused')). Attempt #1 | You can customize retry values in the configuration.

And some other errors I've received:


22:05:00 - openhands:ERROR: retry_mixin.py:47 - litellm.ServiceUnavailableError: OllamaException: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f26fe2c6cc0>: Failed to establish a new connection: [Errno 111] Connection refused')). Attempt #1 | You can customize retry values in the configuration.
Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True'.
Provider List: https://docs.litellm.ai/docs/providers
22:05:15 - openhands:ERROR: retry_mixin.py:47 - litellm.ServiceUnavailableError: OllamaException: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f26fe24b0e0>: Failed to establish a new connection: [Errno 111] Connection refused')). Attempt #2 | You can customize retry values in the configuration.

Hopefully we can find a workaround. I've tried implementing all the recommendations, and I have yet to get OpenHands functioning with the new Qwen2.5 32B model.
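One way to separate the networking question from the settings question (a sketch; assumes the Ollama container is named ollama, both containers share ollama-network, and curl is available in the OpenHands image):

# Does ollama-network actually contain both containers?
docker network inspect ollama-network

# Can the OpenHands container resolve and reach Ollama by name?
docker exec openhands-app curl -s http://ollama:11434/api/tags

The traceback above also shows LiteLLM connecting to localhost:11434 rather than the configured host, which may mean a base URL saved in the UI Settings is overriding the environment variables.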

enyst commented 1 day ago

Could you tell us what the result of running ollama list in the terminal is?

Also, if you have other LLMs in ollama list, does it work with one of them?

Please make sure to enter the model, some dummy key, and the base url in the UI Settings window.

BioCruz-sudo commented 1 day ago

> Could you tell us what the result of running ollama list in the terminal is?
>
> Also, if you have other LLMs in ollama list, does it work with one of them?
>
> Please make sure to enter the model, some dummy key, and the base url in the UI Settings window.

Yeah, I tried running it exactly as you mentioned, but I don't have a dummy API key; should I try 00000000 across the board? So far I've checked the model name with the ollama list command in the terminal and gotten as far as entering a prompt, but I receive the above errors whenever I request any action.

enyst commented 20 hours ago

Yes, for OpenHands with Ollama, you can enter anything as the API key in the OpenHands Settings window.

BioCruz-sudo commented 16 hours ago

> Yes, for OpenHands with Ollama, you can enter anything as the API key in the OpenHands Settings window.

So far these are my settings:

Model: ollama/qwen2.5-coder:32b
Base URL: http://ollama:11434
API key: 000000000

I still receive the errors above.

enyst commented 9 hours ago

The base URL doesn't look right. Please see here for an example of a configuration: https://github.com/All-Hands-AI/OpenHands/issues/3960#issuecomment-2474720099

Also, you may want to take a look at this doc.
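For anyone landing here later: the linked example boils down to pointing the base URL at the Docker host rather than at localhost or a container name. A sketch for Ollama running on the Windows host with OpenHands in Docker (assumes Ollama's default port and reuses the --add-host mapping from the run command above):

docker run -it --rm --pull=always `
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.13-nikolaik `
    -e LLM_MODEL="ollama/qwen2.5-coder:32b" `
    -e LLM_BASE_URL="http://host.docker.internal:11434" `
    -v /var/run/docker.sock:/var/run/docker.sock `
    -p 3000:3000 `
    --add-host host.docker.internal:host-gateway `
    --name openhands-app `
    docker.all-hands.dev/all-hands-ai/openhands:0.13

The same base URL (plus any non-empty API key) then goes into the UI Settings window.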

BioCruz-sudo commented 2 hours ago

Thanks a bunch, I got it running just fine. Thanks for your help!