-
### Priority
P1-Stopper
### OS type
Ubuntu
### Hardware type
Xeon-GNR
### Installation method
- [X] Pull docker images from hub.docker.com
- [ ] Build docker images from source
…
-
**Describe the bug**
When changing the mapset inside a Python script, the temporal framework does not take this change into account and fails to connect to the temporal database, even though GRASS re…
-
Hi - really interesting work. We're currently using HF TGI in production and exploring using this instead. Are there plans to add things like typical_p, which transformers supports? Would greatly ease t…
-
### System Info
GPU: RTX4090
Run 2.1.0 with docker like:
`docker run -it --rm --gpus all --ipc=host -p 8080:80 -v /home/jp/.cache/data:/data ghcr.io/huggingface/text-generation-inference:2.1.0 …
jphme updated
1 month ago
-
Do you have streaming functionality for auto-regressive LLMs? Something similar to Huggingface TGI for example.
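To illustrate what token streaming means here: instead of returning the whole completion at once, the server yields each token as it is decoded, and the client reassembles the stream. Below is a minimal, hypothetical sketch; `fake_generate_tokens` stands in for a real model's decode loop and is not part of any actual library API.

```python
from typing import Iterator

def fake_generate_tokens(prompt: str) -> Iterator[str]:
    # Placeholder for a real auto-regressive decode loop, which would
    # run one forward pass per step and yield each new token until EOS.
    for token in ["Hello", ",", " world", "!"]:
        yield token

def stream_response(prompt: str) -> Iterator[str]:
    # Server-side: emit each token as soon as it is available
    # (e.g. as a server-sent event), rather than buffering the
    # full completion.
    yield from fake_generate_tokens(prompt)

if __name__ == "__main__":
    chunks = list(stream_response("Hi"))
    print("".join(chunks))
```

HF TGI exposes this pattern through a streaming generate endpoint; a comparable server would expose the generator above behind an HTTP response that flushes each chunk.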
-
Someone approached me on IRC and asked how to use text output in TGI. Should be easy enough, I thought. Until I tried to add some simple text output to the existing TGI sample. Apparently not o…
-
Greetings, @cipher982!
Currently we are working on the OpenVINO inference framework, and such benchmarks are critical to understanding the gaps and differences between our framework and Transformers/TGI …
-
- [ ] OpenAI
- [ ] Anthropic
- [ ] Groq
- [ ] Cohere
- [ ] Llama somehow (Ollama & Groq are fine)
-
### Feature request
Support the recent larger embedding models of 7B or more parameters (20x larger than BERT-large)
### Motivation
Embedding models have been getting much larger than before in the pas…
ai-jz updated
6 months ago
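To give a rough sense of what the feature request above implies: a 7B-parameter embedding model is roughly 20x the size of BERT-large (~340M parameters), which matters mostly for serving memory. A back-of-the-envelope sketch, assuming fp16 weights (2 bytes per parameter); the figures are approximate and the function name is hypothetical:

```python
def fp16_gib(params: float) -> float:
    # Approximate weight memory in GiB at 2 bytes per parameter (fp16),
    # ignoring activations, KV caches, and optimizer state.
    return params * 2 / 1024**3

bert_large = 340e6  # ~340M parameters
big_embed = 7e9     # ~7B parameters, ~20x BERT-large

print(f"BERT-large: ~{fp16_gib(bert_large):.1f} GiB")
print(f"7B model:   ~{fp16_gib(big_embed):.1f} GiB")
```

The jump from under 1 GiB to over 13 GiB of weights alone is why supporting these models is a distinct engineering effort rather than a drop-in change.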