-
### System Info
```shell
system: inf2.48xlarge instance
OS: Amazon Linux 2023
build was done with v0.0.23, but mainline has the same issue
```
### Who can help?
@dacorvo
### Informatio…
-
```
$ ./run.sh $(./autotag text-generation-inference)
Namespace(packages=['text-generation-inference'], prefer=['local', 'registry', 'build'], disable=[''], user='dustynv', output='/tmp/aut…
-
### 🚀 The feature, motivation and pitch
[Writer](https://www.writer.com) has introduced the ["Writing in the Margins" algorithm (WiM)](https://arxiv.org/html/2410.05258v1) that
boosts results for long contex…
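As a rough sketch of the inference pattern (placeholders only: `generate` stands for whatever backend call produces text, and the prompt wording below is invented for illustration, not taken from the paper), WiM prefills the long context segment by segment, asks the model to write an intermediate "margin" note after each segment, keeps the notes it judges relevant, and conditions the final answer on them:

```python
def writing_in_the_margins(generate, segments, query):
    """Hedged sketch of the WiM pattern; `generate(prompt) -> str` is a hypothetical backend call."""
    margins = []
    for i in range(len(segments)):
        # Prefill up to the current segment, then ask for a short extractive
        # "margin" note about information relevant to the query.
        context_so_far = "".join(segments[: i + 1])
        margin = generate(
            f"{context_so_far}\n\nExtract any information relevant to: {query}"
        )
        # Keep only margins the model itself classifies as relevant.
        verdict = generate(
            f"Is the following note relevant to '{query}'?\nNote: {margin}\nAnswer yes or no."
        )
        if verdict.strip().lower().startswith("yes"):
            margins.append(margin)
    # Final answer conditions on the full context plus the collected margins.
    notes = "\n".join(margins)
    return generate(
        f"{''.join(segments)}\n\nNotes:\n{notes}\n\nQuestion: {query}"
    )
```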
-
### Feature request
Hello! It would be awesome to have LLaVA support (upload an image to the API and have it embedded via CLIP, etc.)
https://github.com/haotian-liu/LLaVA
text-generation-webui alre…
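For reference, the core of the LLaVA approach is a CLIP vision encoder whose patch features are mapped into the LLM's embedding space by a learned projection (a linear layer in the original LLaVA). Below is a minimal sketch with Hugging Face `transformers`; the checkpoint name, the 4096 hidden size, and the randomly initialized projector are assumptions for illustration, since the real projection weights ship with the LLaVA checkpoints:

```python
import torch
from PIL import Image
from transformers import CLIPImageProcessor, CLIPVisionModel

vision_tower = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-large-patch14")

image = Image.open("example.jpg")  # placeholder image path
pixel_values = processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    # Shape (1, 257, 1024): one CLS token plus 256 patch embeddings.
    patch_features = vision_tower(pixel_values).last_hidden_state

llm_hidden_size = 4096  # assumption: a 7B LLaMA-class language model
projector = torch.nn.Linear(patch_features.shape[-1], llm_hidden_size)  # random init here
image_embeds = projector(patch_features[:, 1:, :])  # drop CLS, keep patch tokens

# `image_embeds` would then be spliced into the prompt's token embeddings at the
# position of an <image> placeholder before running the language model.
```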
-
### Checklist
- [X] The issue exists after disabling all extensions
- [X] The issue exists on a clean installation of webui
- [ ] The issue is caused by an extension, but I believe it is caused b…
-
The most serious one is the function `forward_step` in `megatron/text_generation_utils.py`.
During LLM inference, the model either has to take all tokens preceding the current token as input, or cache the K and V of those tokens. But …
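To make the two options concrete, here is a minimal, generic sketch of single-head attention decoding with and without a K/V cache (illustration only, not the Megatron implementation; no masking, batching, or multi-head handling):

```python
import torch

def attend(q, k, v):
    # q: (1, d), k/v: (t, d) -> (1, d)
    scores = q @ k.T / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v

def decode_without_cache(wq, wk, wv, token_embs):
    # Every step re-feeds the whole prefix and recomputes its K and V.
    outs = []
    for t in range(1, token_embs.shape[0] + 1):
        prefix = token_embs[:t]
        q = prefix[-1:] @ wq
        k, v = prefix @ wk, prefix @ wv
        outs.append(attend(q, k, v))
    return torch.cat(outs)

def decode_with_cache(wq, wk, wv, token_embs):
    # Each step only projects the newest token; K/V of earlier tokens are reused.
    k_cache, v_cache, outs = [], [], []
    for t in range(token_embs.shape[0]):
        x = token_embs[t : t + 1]
        k_cache.append(x @ wk)
        v_cache.append(x @ wv)
        outs.append(attend(x @ wq, torch.cat(k_cache), torch.cat(v_cache)))
    return torch.cat(outs)
```

Both functions produce identical outputs; the cached version just avoids re-projecting the whole prefix at every step.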
-
### System Info
```bash
gpu=0
num_gpus=1
model=meta-llama/Meta-Llama-3.1-8B-Instruct
docker run -d \
--gpus "\"device=$gpu\"" \
--shm-size 16g \
-e HUGGING_FACE_HUB_TOKEN=$token \
-p 8082:80 …
-
Hello, thanks for your great work! I have encountered several problems during the reproduction process and would like to ask for advice:
1. I tried to generate actions using my own audio and used M…
-
### System Info
```shell
text-generation-launcher 2.1.0
```
### Information
- [X] Docker
- [X] The CLI directly
### Tasks
- [ ] An officially supported command
- [ ] My own modifications
### Reprod…
-
# Proposed Feature
Add an efficient interface for computing generation probabilities of fixed prompt-and-completion pairs. For example:
```python
# ... load LLM or engine
prompt_completion_pairs = [
…
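# --- Hedged sketch, separate from the truncated example above ---
# One way such an interface could compute completion log-probabilities today,
# shown with plain Hugging Face transformers rather than any particular engine
# API; the "gpt2" checkpoint is only a placeholder for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def completion_logprob(prompt: str, completion: str) -> float:
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    completion_ids = tok(completion, add_special_tokens=False, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, completion_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Log-probability of every token given its prefix, then keep the completion part.
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = input_ids[:, 1:]
    token_logprobs = logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    return token_logprobs[:, prompt_ids.shape[1] - 1:].sum().item()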