All-Hands-AI / OpenHands

🙌 OpenHands: Code Less, Make More
https://all-hands.dev
MIT License

Enhance LLM Usage by Separating Local vs. API-based Implementations and Reducing Installed Package Dependencies #794

Open dorbanianas opened 5 months ago

dorbanianas commented 5 months ago

Summary

This enhancement aims to separate the usage of local large language models (LLMs) from LLMs accessed through APIs, and to reduce the dependency on installed packages such as NVIDIA libraries and PyTorch. This will provide more flexibility and efficiency in deploying and using OpenDevin.

Motivation

Technical Design

  1. Provide a mechanism to integrate and use local LLMs without requiring locally installed packages.
  2. Develop a standardized interface for integrating API-based LLMs (see the sketch after this list).
  3. Minimize dependencies on installed packages such as NVIDIA libraries and PyTorch.
  4. Ensure flexible configuration and deployment for both local and API-based LLM usage.
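
A minimal sketch of what point 2 can look like in practice: the same OpenAI-style chat payload works against a local Ollama server (which exposes an OpenAI-compatible endpoint on its default port) and against a hosted API, so only the base URL and credentials differ. The model names here are placeholders.

$ PAYLOAD='{"model": "llama3", "messages": [{"role": "user", "content": "Hi"}]}'

# Local: plain HTTP to Ollama, no NVIDIA/PyTorch packages needed in this environment
$ curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" -d "$PAYLOAD"

# Hosted: identical request shape, different base URL plus a real key
$ curl https://api.openai.com/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $OPENAI_API_KEY" -d "$PAYLOAD"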

Additional context

This enhancement aligns with the goal of making OpenDevin more accessible, deployable, and maintainable by reducing dependencies and providing flexible integration options.

JayQuimby commented 5 months ago

I'm not sure about other local implementations, but the Ollama local implementation can run in an external conda env, so removing NVIDIA and PyTorch would not be a problem.

I know some other people are using LM Studio and oobabooga; I'm not sure how those are set up/run.
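
For what it's worth, confirming that an external Ollama server is just an HTTP service takes one request (assuming the default port and an already-pulled model; adjust both to your setup):

$ curl http://localhost:11434/api/generate \
    -d '{"model": "llama3", "prompt": "Hello", "stream": false}'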

kroggen commented 5 months ago

Also, an option for Poetry not to store a cache would help. It is taking 12 GB in my case!
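
For reference, the cache can be inspected, cleared, or relocated with Poetry's own commands; the path below is just an example:

$ poetry cache list                      # show configured caches (e.g. pypi)
$ poetry cache clear pypi --all          # drop all cached artifacts for pypi
$ poetry config cache-dir /tmp/poetry    # or point the cache somewhere disposable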

github-actions[bot] commented 4 months ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

touhidurrr commented 4 months ago

Hi! Any update regarding this issue? I would like to give OpenDevin a try again if this is fixed.

SmartManoj commented 2 months ago

@touhidurrr Could you remove this line and check? https://github.com/OpenDevin/OpenDevin/blob/1d4f422638fb2ac6a3f698168db88abb50056f10/pyproject.toml#L44

$ poetry run pip show torch
Name: torch
Version: 2.2.2
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-nccl-cu12, nvidia-nvtx-cu12, sympy, triton, typing-extensions
Required-by: sentence-transformers

$ poetry run pip show sentence-transformers
Name: sentence-transformers
Version: 3.0.1
Summary: Multilingual text embeddings
Home-page: https://www.SBERT.net
Author: Nils Reimers, Tom Aarsen
Author-email: info@nils-reimers.de
License: Apache License 2.0
Requires: huggingface-hub, numpy, Pillow, scikit-learn, scipy, torch, tqdm, transformers
Required-by: llama-index-embeddings-huggingface

$ poetry run pip show llama-index-embeddings-huggingface    
Name: llama-index-embeddings-huggingface
Version: 0.2.2
Summary: llama-index embeddings huggingface integration
Home-page:
Author: Your Name
Author-email: you@example.com
License: MIT
Requires: huggingface-hub, llama-index-core, sentence-transformers
Required-by:

@enyst Will HuggingFaceEmbedding be used by CodeActAgent?
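
For anyone trying this, a sketch of verifying the chain above after removing that dependency (assuming poetry remove also drops the now-orphaned packages, which may vary by Poetry version):

$ poetry remove llama-index-embeddings-huggingface   # the only consumer of sentence-transformers
$ poetry run pip show sentence-transformers          # should now report "not found"
$ poetry run pip show torch                          # ditto, once nothing requires it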

charliez0 commented 1 month ago

Interesting problem, but even when running locally it's another Docker container that does the LLM work, so these packages should never be needed here.

Also, to use NVIDIA cards a CUDA environment would have to be installed too, but this base image doesn't include one, so the bundled CUDA packages can't even be used.

charliez0 commented 1 month ago

According to the docs for running with local LLMs (https://docs.all-hands.dev/modules/usage/llms/localLLMs), you pass:

--add-host host.docker.internal:host-gateway \
-e LLM_API_KEY="ollama" \
-e LLM_BASE_URL="http://host.docker.internal:11434" \
-e LLM_OLLAMA_BASE_URL="http://host.docker.internal:11434" \

It's using the HTTP API served by another Docker container.
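
For completeness, a sketch of the full command those flags belong to; the image name, port, and socket mount follow the general quickstart pattern and are assumptions that may differ for your version:

$ docker run -it --rm \
    --add-host host.docker.internal:host-gateway \
    -e LLM_API_KEY="ollama" \
    -e LLM_BASE_URL="http://host.docker.internal:11434" \
    -e LLM_OLLAMA_BASE_URL="http://host.docker.internal:11434" \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -p 3000:3000 \
    ghcr.io/all-hands-ai/openhands   # assumed image name; pin a tag in practice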