-
Code snippet:
```python
from vllm import LLM, SamplingParams
from time import time
# Sample prompts.
prompts = [
"Hello, my name is",
"The president of the United States is",
"The capi…
-
### System Info
Pre-built Docker image on g4dn.xlarge with Deep Learning OSS Nvidia Driver AMI GPU PyTorch 2.2.0 (Amazon Linux 2)
### Information
- [X] Docker
- [ ] The CLI directly
### T…
-
It would be great if baml worked with batch inference tools like vLLM, or added its own.
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a sim…
-
**Describe the bug**
The request from iLab seems to be cut off at Task 3 of the input prompt. I set up a proxy service that accepts the iLab client request and can therefore print out the request message r…
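A logging proxy of the kind described can be sketched roughly as below; the upstream URL, port, and route are hypothetical, not the reporter's actual setup:
```python
# Rough sketch of a logging reverse proxy: forwards requests to an
# upstream server and prints the raw body so prompt truncation is visible.
import httpx
import uvicorn
from fastapi import FastAPI, Request, Response

app = FastAPI()
UPSTREAM = "http://localhost:8000"  # hypothetical inference server behind the proxy

@app.post("/{path:path}")
async def proxy(path: str, request: Request):
    body = await request.body()
    # Print the full request body so any cut-off in the prompt can be inspected.
    print(body.decode("utf-8", errors="replace"))
    async with httpx.AsyncClient(timeout=None) as client:
        upstream = await client.post(
            f"{UPSTREAM}/{path}",
            content=body,
            headers={"content-type": request.headers.get("content-type", "application/json")},
        )
    return Response(content=upstream.content, status_code=upstream.status_code)

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=9000)
```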
-
Please solve the problem in the code below:
```python
import torch
import uvicorn
import gc
import asyncio
import argparse
import io
from fastapi import FastAPI, WebSocket, Depends
from fastapi.responses …
```
-
### Your current environment
Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A
OS: Debian GNU/Lin…
-
- [x] I have read and agree to the [contributing guidelines](https://github.com/griptape-ai/griptape#contributing).
**Is your feature request related to a problem? Please describe.**
A customer fr…
-
Context
**What are you trying to do and how would you want to do it differently? Is it something you currently cannot do? Is this related to an issue/problem?**
**Answer:** Trying to model t…
-
**Describe the bug**
The on_chat_resume function is unable to retrieve message metadata from the thread's steps.
**To Reproduce**
Start a chat session and send messages.
Resume the…
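A minimal reproduction sketch, assuming the standard `@cl.on_chat_resume` hook (the inspected keys follow the `ThreadDict`/`StepDict` shape; everything else is illustrative):
```python
# Sketch of the resume hook where the missing metadata can be observed.
import chainlit as cl
from chainlit.types import ThreadDict

@cl.on_chat_resume
async def on_chat_resume(thread: ThreadDict):
    for step in thread["steps"]:
        # Per the report, step["metadata"] is expected to hold the message
        # metadata but cannot be retrieved on resume.
        print(step.get("name"), step.get("metadata"))
```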