-
System Info
GPU: NVIDIA RTX 4090
TensorRT-LLM 0.13
Question 1: How can I use the OpenAI-compatible API to perform inference on a TensorRT engine model?
root@docker-desktop:/llm/tensorrt-llm-0.13.0/examples/apps# pyt…
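One way to query such a server, as a minimal client-side sketch: it assumes the OpenAI-compatible server from examples/apps is already running locally, and the base URL, port, and model name below are placeholders for your setup.

```python
from openai import OpenAI

# Point the standard OpenAI client at the local TensorRT-LLM server;
# base URL, port, and model name are placeholders for your setup.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="tensorrt_llm",  # whatever model name the server registers
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
```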
-
Description:
As a developer, one of the main challenges I've encountered in my project is the limitation of relying on external language models, especially when I reach usage limits or encounte…
-
## Description
When using the plugin with an LLM model running on an Ollama server hosted locally (e.g., on another server within the same local network), the plugin successfully connects to the Ollama AP…
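For context, this is roughly how a client reaches a LAN-hosted Ollama instance; the host IP below is a placeholder, and the server must bind to a non-loopback interface (e.g., started with `OLLAMA_HOST=0.0.0.0`) to accept connections from other machines.

```python
import requests

# Placeholder LAN address; Ollama listens on port 11434 by default and must
# be started with OLLAMA_HOST=0.0.0.0 to accept non-local connections.
OLLAMA_URL = "http://192.168.1.50:11434"

resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={"model": "llama3", "prompt": "Say hello", "stream": False},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["response"])
```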
-
The Tile prompter currently links to Hugging Face.
It would be better to give users the option to configure and use local VLM & LLM models.
-
**Enhancement / feature description:**
MindForger with local LLM support.
-
### What happened?
Using config
```yaml
model_list:
  - model_name: bge-large-en-v1.5
    litellm_params:
      model: huggingface/BAAI/bge-large-en-v1.5
      api_base: http://localhost:80…
```
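With that config, the model should be reachable through the proxy's OpenAI-compatible embeddings route. A minimal sketch, assuming the LiteLLM proxy runs on its default port 4000 and the API key is whatever the proxy is configured to accept:

```python
from openai import OpenAI

# The LiteLLM proxy exposes an OpenAI-compatible API; 4000 is its default
# port, and the api_key only needs to match the proxy's configuration.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-anything")

emb = client.embeddings.create(
    model="bge-large-en-v1.5",  # the model_name registered in model_list
    input=["hello world"],
)
print(len(emb.data[0].embedding))
```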
-
### Describe the bug
When starting the project via Docker, after filling in the .env.local file with API keys and URLs of local LLM systems, the environment variables are not picked up. Th…
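One quick way to confirm whether the variables reach the container is to print them from inside it; the variable names below are placeholders for the keys in .env.local. Note that Docker Compose only auto-loads a file named `.env` for substitution, so `.env.local` usually has to be passed explicitly via `env_file:` or `docker run --env-file`.

```python
import os

# Placeholder names; substitute the actual keys from your .env.local.
expected = ["OPENAI_API_KEY", "LOCAL_LLM_BASE_URL"]
for name in expected:
    print(f"{name}: {'set' if os.environ.get(name) else 'MISSING'}")
```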
-
tasks:
* add more API keys to the frontend secrets file (locally and on Streamlit Community Cloud for deployment)
* replace some google-generative-ai SDK code with litellm to simplify the code for switching be… (see the litellm sketch below)
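A rough sketch of what the litellm switch could look like; the model strings are examples, and each provider still needs its own key (e.g., `GEMINI_API_KEY`) in the environment or secrets file.

```python
import litellm

# Example model strings; litellm routes each to the right provider,
# so the call site stays identical when switching backends.
for model in ("gemini/gemini-1.5-flash", "ollama/llama3"):
    resp = litellm.completion(
        model=model,
        messages=[{"role": "user", "content": "Say hi in five words."}],
    )
    print(model, "->", resp.choices[0].message.content)
```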
-
I am very excited by the idea of text to GQL and would love to implement it for my organization. The context to send along is pretty big, though. Thus I'd love the option to use a local llama instance …
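A sketch of what that could look like against a local llama served through an OpenAI-compatible endpoint (e.g., Ollama's /v1 route); the URL, model name, and toy schema below are placeholders.

```python
from openai import OpenAI

# Placeholder endpoint; Ollama exposes an OpenAI-compatible API under /v1.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

schema = "type Query { user(id: ID!): User } type User { id: ID! name: String! }"

resp = client.chat.completions.create(
    model="llama3",
    messages=[
        {"role": "system",
         "content": f"Translate the user's question into a GraphQL query for this schema:\n{schema}"},
        {"role": "user", "content": "Get the name of user 42"},
    ],
)
print(resp.choices[0].message.content)
```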
-
Hi,
I've tried creating an agent using an OpenAI Assistant as the LLM. It joins the room and works as expected until after its first utterance. After speaking the string I pass into the agent.…
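For reference, the bare Assistants API round trip outside any agent framework looks roughly like this (the assistant ID is a placeholder for one created beforehand); comparing against it can help isolate whether the stall is in the framework or in the Assistant itself.

```python
import time
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set
ASSISTANT_ID = "asst_..."  # placeholder for a previously created Assistant

thread = client.beta.threads.create()
client.beta.threads.messages.create(thread_id=thread.id, role="user", content="Hello")

# Start a run and poll until the Assistant has produced its reply.
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=ASSISTANT_ID)
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

for message in client.beta.threads.messages.list(thread_id=thread.id):
    print(message.role, ":", message.content[0].text.value)
```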