-
Hi.
How can I use my own API, hosted on a local or an external server, for example?
I saw an example using the "POST / (GET) API CHAIN" connection together with Chaintool.
As a result: I will not receive th…
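For the local-server case, a minimal sketch of calling such an API over HTTP may help. Everything here is illustrative: the endpoint path `/api/tool` and the echo payload are made up, and a throwaway stdlib server stands in for the real API so the snippet is self-contained.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical local API: echoes the POSTed JSON back under "echo".
class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        data = json.dumps({"echo": json.loads(body)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Call the local API with a POST request, as an agent tool would.
req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}/api/tool",
    data=json.dumps({"query": "hello"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.loads(resp.read())
server.shutdown()
```

The same POST call works unchanged against an external server; only the URL changes.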
-
### System Info
NVIDIA RTX A6000
### Who can help?
@juney-nvidia
Hi
I'm interested in using TensorRT-LLM for multiple inference requests, but I'd like to be able to adjust the `num_be…
-
```
llm_cfg = {
    # Use the model service provided by DashScope:
    'model': 'qwen-vl-max-0809',
    # 'api_key': 'YOUR_DASHSCOPE_API_KEY',
    # It will use the `DASHSCOPE_API_KEY` environment…
-
When using the agent, I encountered an issue where the prompt is structured as follows:
```
Text: My name is Jack. I come from America. I'm an associate professor working at the University of Califo…
-
### System Info
GPU: `A10`
Base Image: `FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04`
Tensorrt-llm:
- `0.12.0`: it works, but I can't use it because of a version mismatch between TRT and trt-llm-back…
-
### System Info
**OS version**: macOS Sequoia 15.0
My *pyproject.toml*
```
[project]
name = "pandasai-benchmark"
version = "0.1.0"
description = "Add your description here"
readme = "READM…
-
### Feature Description
LiteLLM is a wrapper over LLMs from non-OpenAI providers that harmonizes their APIs to OpenAI's API, making it easy to switch between LLMs.
Most LLMs (l…
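The harmonization idea can be shown with a toy adapter. This is not LiteLLM's actual code; the provider payload and field names are invented for illustration. The adapter maps a provider-specific reply into the OpenAI chat-completion shape, so callers only ever handle one schema.

```python
# Toy sketch of API harmonization (not LiteLLM's actual implementation):
# adapt a hypothetical provider response into OpenAI's chat-completion shape.

def to_openai_format(provider_response: dict, model: str) -> dict:
    """Map a provider-specific reply onto the OpenAI response schema."""
    return {
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": provider_response["output_text"],
                },
                "finish_reason": "stop",
            }
        ],
    }

raw = {"output_text": "Hello!"}  # made-up provider payload
resp = to_openai_format(raw, model="some-provider/some-model")
print(resp["choices"][0]["message"]["content"])  # → Hello!
```

Switching providers then only means swapping the adapter, not the calling code.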
-
I use the following CLI command from the README to apply a rate limit to my responses:
```
pqa --summary_llm_config '{"rate_limit": {"gpt-4o-2024-08-06": "30000 per 1 minute"}}' ask 'Are th…
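The value passed to `--summary_llm_config` is a JSON mapping from model name to a limit string. A small stdlib sketch of parsing it, treating the exact schema as an assumption inferred from the command above:

```python
import json

# Parse the --summary_llm_config JSON from the command above.
# The schema (a "rate_limit" dict keyed by model name) is an assumption
# based on the README snippet, not a documented spec.
cfg = json.loads('{"rate_limit": {"gpt-4o-2024-08-06": "30000 per 1 minute"}}')

limit = cfg["rate_limit"]["gpt-4o-2024-08-06"]
tokens, _, period = limit.partition(" per ")
print(int(tokens), period)  # → 30000 1 minute
```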
-
**Describe the Enhancement**
We currently unfurl input messages for spans with span_kind=LLM. We do not unfurl messages for other span_kinds, such as chain and agent spans. Some of our instrumentors, i…
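A span_kind-agnostic unfurl could look roughly like the sketch below. The span shape and the `input.messages` attribute name are illustrative assumptions, not the project's actual data model.

```python
# Illustrative sketch only: the span dict shape and the "input.messages"
# attribute key are assumptions, not the instrumentor's real schema.

def unfurl_messages(span: dict) -> list[dict]:
    """Flatten input messages from a span's attributes, for any span_kind."""
    attrs = span.get("attributes", {})
    messages = attrs.get("input.messages", [])
    return [
        {"role": m.get("role", "user"), "content": m.get("content", "")}
        for m in messages
    ]

chain_span = {
    "span_kind": "CHAIN",
    "attributes": {"input.messages": [{"role": "user", "content": "hi"}]},
}
print(unfurl_messages(chain_span))  # → [{'role': 'user', 'content': 'hi'}]
```

Because the helper never inspects span_kind, chain and agent spans get the same treatment as LLM spans.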
-
### Describe the issue
Hello, I am trying to use Autogen for a multi-agent healthcare system. The code looks like this:
config_list = [
    {
        "model": "gpt-3.5-turbo-16k",
        …
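For context, a sketch of what a complete entry in such a `config_list` commonly looks like, plus a tiny validation helper. Only the `"model"` field appears in the snippet above; the other fields are common-usage assumptions, not taken from the issue.

```python
# Sketch of a typical Autogen-style config_list. Only "model" comes from the
# snippet above; "api_key" is a common-usage assumption with a placeholder.
config_list = [
    {
        "model": "gpt-3.5-turbo-16k",
        "api_key": "YOUR_OPENAI_API_KEY",  # placeholder, not a real key
    }
]

def validate(configs: list[dict]) -> bool:
    """Check that every entry has at least a model name and an API key."""
    return all({"model", "api_key"} <= entry.keys() for entry in configs)

print(validate(config_list))  # → True
```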