-
### My environment setup
1st environment (running on ec2 `g6.4xlarge`)
```
[2024-06-01T10:14:23Z] Collecting environment information...
[2024-06-01T10:14:26Z] PyTorch version: 2.3.0+cu121
[2024-0…
khluu updated 3 weeks ago
-
I ran the first command provided (to sanity-check my setup, since I usually get very large output errors for bigger models like LLMs), and I got an output validation error.
I've made sur…
-
Users are seeking assistance or guidance on how to properly set up and configure the LLM function to run on Mac systems. They may be facing difficulties in installing dependencies, configuring environ…
-
I don't know how my VS Code is configured, but I need to paste the dotenv snippet into every notebook to have my OPENAI key loaded.
```
%pip install python-dotenv
from dotenv import load_dotenv
load_dote…
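For context, what `load_dotenv()` does can be sketched with the standard library alone. This is a simplified stand-in (the real python-dotenv also handles quoting, `export` prefixes, and variable interpolation; the function name here is illustrative, not the library's API):

```python
import os

def load_env_file(path=".env"):
    # Minimal stand-in for python-dotenv's load_dotenv(): read KEY=VALUE
    # lines from a file and export them into os.environ.
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Like load_dotenv's default, don't clobber variables that are
            # already set in the environment.
            os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

With the real library, calling `load_dotenv()` once at the top of each notebook makes `os.getenv("OPENAI_API_KEY")` return the value stored in `.env`.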
-
Recently, initial Mamba support (CPU-only) has been introduced in #5328 by @compilade
In order to support running these models efficiently on the GPU, we seem to be lacking kernel implementations …
-
Hi,
I am new to ggml, but what you have built is really good! Thanks a lot for that.
I was wondering if you could give me some pointers on how to add a custom kernel for the GEMM/Matmul ops in the di…
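For anyone exploring this, it helps to pin down what such a kernel must compute. Below is a pure-Python reference GEMM (an illustration only, not ggml's API or kernel interface) that a custom implementation's output can be checked against:

```python
def matmul_ref(a, b):
    # Naive reference GEMM: C[i][j] = sum_k A[i][k] * B[k][j].
    # a is an m x k matrix, b is k x n, both given as nested lists.
    m, k = len(a), len(a[0])
    k2, n = len(b), len(b[0])
    assert k == k2, "inner dimensions must match"
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for p in range(k):        # i-k-j loop order reuses a[i][p]
            aip = a[i][p]
            for j in range(n):
                c[i][j] += aip * b[p][j]
    return c
```

A custom kernel can then be validated by comparing its output against this reference element-wise, within a small floating-point tolerance.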
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch…
-
### Your current environment
```text
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.6 LTS (x86_64)
GCC ve…
-
### Your current environment
```text
# Using pip install vllm
vllm==v0.5.1
```
### 🐛 Describe the bug
```text
# My python script to test long text
def run_Mixtral():
tokenizer = AutoTok…
-
### 🐛 Describe the bug
this code is slightly modified from [async llm engine test](https://github.com/vllm-project/vllm/blob/4cf256ae7f8b0be8f06f6b85821e55d4f5bdaa13/tests/async_engine/test_async_llm_…