-
### Solution to issue cannot be found in the documentation.
- [X] I checked the documentation.
### Issue
cc @jakirkham @h-vetinari @jan-janssen @RaulPPelaez
**TODO**
* [ ] openllm 0.5.X r…
-
I would like to see some new models added to HuggingChat, and I will provide my reasoning for each. One is a small model (7B); the other is larger (35B).
# Starling Beta 7B
Link: [HuggingFace Model]…
-
System: WSL2 (Ubuntu 22.04) on Windows 10
GPU: Tesla P40
After installing the environment with `pip install -r`, the program would not run. Debugging showed the cause was that the installed transformers version was too new, so I removed the `device_map="auto"` part from line 46 of llm.py, after which it started successfully.
Text generation works fine, but after loading an image the following error appears. Could you help analyze the cause?
======>Auto …
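For reference, here is a minimal sketch of the workaround described above, assuming llm.py loads the model through transformers' `from_pretrained` (the actual contents of the file are not shown in this report, and the model id below is a placeholder):
```python
# Sketch of the workaround: drop device_map="auto" and place the model
# on the GPU explicitly. The model id is a placeholder, not from the report.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-model",      # placeholder
    torch_dtype=torch.float16,
    # device_map="auto",        # removed, per the workaround above
)
model = model.to("cuda")        # move the model to the Tesla P40 explicitly
```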
-
### Feature request
Is it possible to run multimodal LLMs like Qwen-VL or LLaVA 1.5 using openllm?
### Motivation
_No response_
### Other
_No response_
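For illustration, this is roughly what LLaVA 1.5 inference looks like with plain transformers (the `llava-hf/llava-1.5-7b-hf` checkpoint and the image URL are assumptions for the sketch); whether openllm can wrap a pipeline like this is exactly the question:
```python
# LLaVA 1.5 inference via transformers, independent of openllm.
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"
model = LlavaForConditionalGeneration.from_pretrained(model_id, device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

# Placeholder image URL; any RGB image works.
image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)
prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```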
-
Congratulations on your amazing Open LLM Leaderboard ranking!
I'm very curious about the technical details of this work and fully agree with your _do right instruction tuning_ title. So would there …
-
Hello, I have run into an issue where a model quantized with AWQ shows more performance degradation than expected.
I know that ModelOpt provides optimized kernels and quantization algorithms for fast quanti…
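For comparison, here is a hedged sketch of the kind of AWQ setup being compared, using AutoAWQ rather than ModelOpt (the model id and output path are placeholders; 4-bit weights with group size 128 is the common AWQ configuration):
```python
# 4-bit AWQ quantization with AutoAWQ; ids and paths are placeholders.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "meta-llama/Llama-2-7b-hf"   # placeholder model
quant_path = "llama-2-7b-awq"             # placeholder output directory

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Typical AWQ config: 4-bit weights, zero points, group size 128.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
```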
-
## Model
- [Falcon 7B Instruct](https://huggingface.co/tiiuae/falcon-7b-instruct)
## Steps
- [x] Use the [openllm](https://github.com/bentoml/OpenLLM) library to load the model
- [x] Pass the model …
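As a sanity check for the loading step above, here is a substitute sketch using plain transformers (the exact openllm API differs across versions, so it is deliberately not shown):
```python
# Verify that tiiuae/falcon-7b-instruct loads and generates at all,
# independent of openllm.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("What is the capital of France?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```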
-
Images are successfully stored under the image directory.
-
### Describe the bug
When I try to serve a Llama 3.1 8B 4-bit model with openllm, it says that "This model's maximum context length is 2048 tokens".
On https://huggingface.co/meta-llama/Meta-Llama-3.1-8B,…
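One quick check (an assumption about where to look, not a confirmed diagnosis) is what context window the model config itself advertises; Llama 3.1 reports 131072 positions, so a 2048 limit would be coming from the serving configuration rather than the model:
```python
# Read the context window straight from the model config.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3.1-8B")
print(cfg.max_position_embeddings)  # Llama 3.1 advertises 131072 here
```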
-
### Describe the bug
When running `openllm build` with `BENTOML_HOME=/foobar` (for example):
1. First, the model weights are downloaded to a directory under `$HOME` (in my case, under `/root` be…
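A reproduction sketch of the setup (the build arguments elided in the report are left out; the expectation is that everything lands under the overridden `BENTOML_HOME`):
```python
# Run `openllm build` with BENTOML_HOME overridden. Expected: all artifacts
# under /foobar; observed: weights are first downloaded under $HOME.
import os
import subprocess

env = dict(os.environ, BENTOML_HOME="/foobar")  # the override from the report
subprocess.run(["openllm", "build"], env=env, check=False)  # plus the report's build args
```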