huggingface / text-generation-inference
Large Language Model Text Generation Inference
Docs: http://hf.co/docs/text-generation-inference
License: Apache License 2.0
7.93k stars · 859 forks
Issues (sorted by: Newest)
#1854 Add router name to /info endpoint · by Wauplin · opened 1 hour ago · 0 comments
#1853 404 for multi-modal docs · by RonanKMcGovern · opened 2 hours ago · 0 comments
#1852 Serverless Inference API endpoint fails to return logprobs via chat completions · by ggbetz · opened 19 hours ago · 0 comments
#1851 Upgrading to Rust 1.78 · by Narsil · opened 20 hours ago · 0 comments
#1850 [WIP] MLPSpeculator speculative decoding support · by JRosenkranz · opened 20 hours ago · 1 comment
#1849 Updating Phi-3 (long context) · by Narsil · closed 18 hours ago · 0 comments
#1848 Remove misleading warning (not that important nowadays anyway) · by Narsil · closed 22 hours ago · 0 comments
#1847 UserWarning: You are using a Backend <class 'text_generation_server.utils.dist.FakeGroup'> as a ProcessGroup. This usage is deprecated since PyTorch 2.0 · by fxmarty · opened 23 hours ago · 0 comments
#1846 Do I need to additionally apply an inference template? · by Semihal · opened 1 day ago · 0 comments
#1845 feat: improve message content chunks handling · by drbh · opened 1 day ago · 0 comments
#1844 feat: prefer huggingface_hub in docs and show image api · by drbh · closed 20 hours ago · 1 comment
#1843 Fix: "Fixing" double BOS for Mistral too · by Narsil · closed 1 day ago · 0 comments
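The "double BOS" fix in #1843 addresses a common pitfall: a chat template that bakes the BOS token into the prompt while the tokenizer is also configured to prepend one, so the sequence starts with two BOS tokens. A minimal sketch of such a guard (a hypothetical helper for illustration, not TGI's actual code):

```python
def ensure_single_bos(token_ids, bos_id):
    """Prepend BOS only if the sequence does not already start with it.

    Guards against templates that include BOS in the rendered prompt
    while the tokenizer also adds one (the "double BOS" bug).
    """
    if token_ids and token_ids[0] == bos_id:
        return list(token_ids)
    return [bos_id] + list(token_ids)

print(ensure_single_bos([1, 42, 7], bos_id=1))  # BOS already present, unchanged
print(ensure_single_bos([42, 7], bos_id=1))     # BOS prepended exactly once
```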
#1842 Unable to stop TGI after serving models · by ponshane · opened 1 day ago · 0 comments
#1841 Adding scripts to prepare load data · by Narsil · closed 1 day ago · 0 comments
#1840 Dummy PR (secrets/credentials?) · by Narsil · closed 2 days ago · 0 comments
#1839 Version 2.0.2 · by Narsil · closed 2 days ago · 0 comments
#1838 Failing to start a TGI pod with 2 or more GPUs: sharding fails · by jayteaftw · opened 2 days ago · 0 comments
#1837 Cannot launch: error "exllamav2_kernels not installed" · by coderaBruce · opened 2 days ago · 2 comments
#1836 fix: split docs and start conceptual page · by drbh · closed 2 days ago · 1 comment
#1835 feat: move allocation logic to Rust · by OlivierDehaene · opened 2 days ago · 0 comments
#1834 TGI crashes on complex JSON schemas provided as grammar, without any information (at debug/trace level) · by o1iv3r · opened 2 days ago · 1 comment
#1833 (chore): torch 2.3.0 · by Narsil · closed 2 days ago · 0 comments
#1832 Enable testing TGI on XPU · by mfuntowicz · opened 2 days ago · 0 comments
#1831 Out-of-memory errors when running text-generation-benchmark despite compliant batch token limit · by martinigoyanes · opened 3 days ago · 2 comments
#1830 Martinigoyanes fix frequency penalty · by drbh · closed 3 days ago · 1 comment
#1829 Bugfix: add tools prompt · by drbh · closed 3 days ago · 2 comments
#1828 Handle images in chat API · by drbh · closed 3 days ago · 1 comment
#1827 Better graceful shutdown · by Narsil · closed 3 days ago · 0 comments
#1826 Process hangs in local run · by Hojun-Son · opened 3 days ago · 1 comment
#1825 Add the missing `tool_prompt` parameter to the Python client · by maziyarpanahi · closed 3 days ago · 1 comment
#1824 TGI model loading consumes all available GPU memory · by IdleIdiot · opened 5 days ago · 0 comments
#1823 Python client: extra slash in base_uri leads to failures in the chat endpoint · by kcarnold · opened 5 days ago · 0 comments
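Issue #1823 describes a classic URL-joining pitfall: a trailing slash on the client's base URI plus a leading slash on the endpoint path produces a `//` in the request URL, which some servers reject. A sketch of a normalization helper (hypothetical, not the Python client's actual fix):

```python
def join_endpoint(base_url: str, path: str) -> str:
    # Join a base URL and an endpoint path with exactly one slash,
    # regardless of trailing/leading slashes on either side.
    return base_url.rstrip("/") + "/" + path.lstrip("/")

# Both spellings of the base URI yield the same well-formed URL.
print(join_endpoint("http://localhost:8080/", "/v1/chat/completions"))
print(join_endpoint("http://localhost:8080", "v1/chat/completions"))
```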
#1822 Support for ReFT · by RonanKMcGovern · opened 5 days ago · 0 comments
#1821 Support for InternVL-Chat-V1-5 · by Iven2132 · opened 6 days ago · 1 comment
#1820 Changing the waiting_served_ratio default (stack more aggressively by default) · by Narsil · closed 4 days ago · 1 comment
#1819 Planned/potential significant work · by Narsil · opened 6 days ago · 0 comments
#1818 Fixing Qwen2 · by Narsil · closed 6 days ago · 0 comments
#1817 Dummy CI run · by Narsil · closed 6 days ago · 0 comments
#1816 Second round of benchmark modifications (tiny adjustments to avoid overloading the host) · by Narsil · closed 6 days ago · 0 comments
#1815 Blunder · by Narsil · closed 6 days ago · 0 comments
#1814 Shared volume using mountpoint-s3: permission issues · by Smana · opened 1 week ago · 5 comments
#1813 feat: add one-line quickstart · by drbh · closed 3 days ago · 1 comment
#1812 feat: add VLM docs and simple examples · by drbh · closed 3 days ago · 2 comments
#1811 Fixing frequency penalty · by martinigoyanes · closed 3 days ago · 6 comments
#1810 Frequency penalty corrupting generations · by martinigoyanes · closed 2 days ago · 1 comment
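Issues #1810 and #1811 concern the frequency penalty. In the common OpenAI-style formulation, each token's logit is reduced in proportion to how many times that token has already been generated; a bug in this step can corrupt generations across the whole vocabulary. A sketch of the formula (for illustration only, not TGI's implementation):

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_ids, penalty):
    # OpenAI-style frequency penalty: logit[t] -= penalty * count(t),
    # where count(t) is how often token t appears in the output so far.
    # Tokens never generated (count 0) are left untouched.
    counts = Counter(generated_ids)
    return [logit - penalty * counts[tok] for tok, logit in enumerate(logits)]

# Token 1 was generated twice, token 2 once, token 0 never.
print(apply_frequency_penalty([1.0, 2.0, 3.0], [1, 1, 2], penalty=0.5))
```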
#1809 Inference error for Mistral-7B v0.2 while deploying in an Azure VM · by a-myth-biswas · opened 1 week ago · 0 comments
#1808 Use the generation config · by Narsil · closed 1 week ago · 1 comment
#1807 Add support for the Phi-3 model · by ChristophRaab · closed 1 day ago · 4 comments
#1806 Take num_return_sequences into account to get multiple outputs · by mauryaland · opened 1 week ago · 0 comments
#1805 EETQ quantization cannot be performed locally · by Gongzai-SURE · opened 1 week ago · 0 comments