-
### System Info
Running the TGI 2.0.4 Docker image
model=microsoft/Phi-3-small-128k-instruct
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference…
-
I want to use TGIS to begin analyzing and visualizing paleoprecipitation. To get started, I have a set of files with July precipitation estimates every 200 years from 10000 BP to 4000 BP. This brings…
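The time axis described above can be sketched in Python; the `precip_jul_{age}BP.nc` filename pattern is a hypothetical placeholder, not the actual file naming:

```python
# One July-precipitation file per 200-year step, 10000 BP down to 4000 BP.
# The filename pattern below is an assumption for illustration only.
ages_bp = list(range(10_000, 4_000 - 1, -200))
files = [f"precip_jul_{age}BP.nc" for age in ages_bp]

# (10000 - 4000) / 200 + 1 = 31 time steps in total
```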
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
### Feature request
TEI has the `API_KEY` argument:
> --api-key
> Set an api key for request authorization.
>
> By default the server responds to every request. Wit…
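For context, TEI's `--api-key` checks for the key as a bearer token, so a client for a TGI server with the same flag would presumably attach it like this; the URL and key below are placeholders, and the exact auth scheme TGI would adopt is an assumption:

```python
# Hypothetical client-side authorization, assuming TGI mirrored TEI's
# --api-key behavior (key sent as an Authorization bearer token).
API_KEY = "example-key"               # placeholder
TGI_URL = "http://localhost:8080/generate"  # placeholder

def auth_headers(api_key: str) -> dict:
    # TEI expects "Authorization: Bearer <key>" on each request.
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```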
-
This is a tracker for all the pieces of feature work needed to integrate and support KServe/Caikit/TGIS for FM Serving.
# Requirements
Add requirements
# Individual…
-
Greetings, @cipher982!
I've seen the benchmark application https://www.llm-benchmarks.com/local and it looks great! I'm currently working on a competitive analysis of these 4 backends: Transformers…
-
It would be useful to have a way to control individual points when drawing a line. In the proposed TGI line drawing functions below, the `callback`s would be called prior to plotting each point, and i…
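A minimal sketch of the per-point callback idea, using Bresenham's line algorithm; the callback contract here (return `False` to skip plotting a point) is an assumption for illustration, not the proposed API:

```python
def draw_line(x0, y0, x1, y1, plot, callback=None):
    """Bresenham's line from (x0, y0) to (x1, y1).

    `callback(x, y)` is invoked before plotting each point; if it
    returns False the point is skipped (hypothetical contract).
    """
    dx = abs(x1 - x0)
    sx = 1 if x0 < x1 else -1
    dy = -abs(y1 - y0)
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        if callback is None or callback(x0, y0):
            plot(x0, y0)
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
```

For example, passing `callback=lambda x, y: x % 2 == 0` would plot only the points with even x-coordinates.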
-
Which TGI version, startup parameters, and hardware is the TGI comparison data based on?
-
Supporting only Ray for distributed inference will significantly reduce adoption of this tool, even if it truly is more performant than TGI. TGI can be run as a black-box image on Kubernetes with su…
-
I plan to implement function calling based on the image input with vision models such as LLaVA and Nous-Hermes-2-Vision-Alpha, but it seems that the current implementation in the example folder only sup…