huggingface / text-generation-inference
Large Language Model Text Generation Inference
Docs: http://hf.co/docs/text-generation-inference
License: Apache License 2.0
7.93k stars · 859 forks
Issues (sorted by: Newest)
#1854 Add router name to /info endpoint · by Wauplin · opened 1 hour ago · 0 comments
#1853 404 for multi-modal docs · by RonanKMcGovern · opened 2 hours ago · 0 comments
#1852 Serverless Inference API endpoint fails to return logprobs via chat completions · by ggbetz · opened 19 hours ago · 0 comments
#1851 Upgrading to Rust 1.78 · by Narsil · opened 20 hours ago · 0 comments
#1850 [WIP] MLPSpeculator speculative decoding support · by JRosenkranz · opened 20 hours ago · 1 comment
#1849 Updating Phi-3 (long context) · by Narsil · closed 18 hours ago · 0 comments
#1848 Remove misleading warning (not that important nowadays anyway) · by Narsil · closed 22 hours ago · 0 comments
#1847 UserWarning: You are using a Backend <class 'text_generation_server.utils.dist.FakeGroup'> as a ProcessGroup. This usage is deprecated since PyTorch 2.0 · by fxmarty · opened 23 hours ago · 0 comments
#1846 Do I need to additionally apply an inference template? · by Semihal · opened 1 day ago · 0 comments
#1845 feat: improve message content chunks handling · by drbh · opened 1 day ago · 0 comments
#1844 feat: prefer huggingface_hub in docs and show image api · by drbh · closed 20 hours ago · 1 comment
#1843 Fix: "Fixing" double BOS for Mistral too · by Narsil · closed 1 day ago · 0 comments
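The "double BOS" fix in #1843 addresses a common pitfall: a chat template that bakes the BOS token into the prompt while the tokenizer is also configured to prepend one, so the sequence starts with two BOS tokens. A minimal sketch of such a guard (a hypothetical helper for illustration, not TGI's actual code):

```python
def ensure_single_bos(token_ids, bos_id):
    """Prepend BOS only if the sequence does not already start with it.

    Guards against templates that include BOS in the rendered prompt
    while the tokenizer also adds one (the "double BOS" bug).
    """
    if token_ids and token_ids[0] == bos_id:
        return list(token_ids)
    return [bos_id] + list(token_ids)

print(ensure_single_bos([1, 42, 7], bos_id=1))  # BOS already present, unchanged
print(ensure_single_bos([42, 7], bos_id=1))     # BOS prepended exactly once
```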
#1842 Unable to stop TGI after serving models · by ponshane · opened 1 day ago · 0 comments
#1841 Adding scripts to prepare load data · by Narsil · closed 1 day ago · 0 comments
#1840 Dummy PR (secrets/credentials?) · by Narsil · closed 2 days ago · 0 comments
#1839 Version 2.0.2 · by Narsil · closed 2 days ago · 0 comments
#1838 Failing to start a TGI pod with 2 or more GPUs: sharding fails · by jayteaftw · opened 2 days ago · 0 comments
#1837 Cannot launch: error "exllamav2_kernels not installed" · by coderaBruce · opened 2 days ago · 2 comments
#1836 fix: split docs and start conceptual page · by drbh · closed 2 days ago · 1 comment
#1835 feat: move allocation logic to Rust · by OlivierDehaene · opened 2 days ago · 0 comments
#1834 TGI crashes on complex JSON schemas provided as grammar, without any information (at debug/trace level) · by o1iv3r · opened 2 days ago · 1 comment
#1833 (chore): torch 2.3.0 · by Narsil · closed 2 days ago · 0 comments
#1832 Enable testing TGI on XPU · by mfuntowicz · opened 2 days ago · 0 comments
#1831 Out-of-memory errors when running text-generation-benchmark despite compliant batch token limit · by martinigoyanes · opened 3 days ago · 2 comments
#1830 Martinigoyanes fix frequency penalty · by drbh · closed 3 days ago · 1 comment
#1829 Bugfix: add tools prompt · by drbh · closed 3 days ago · 2 comments
#1828 Handle images in chat API · by drbh · closed 3 days ago · 1 comment
#1827 Better graceful shutdown · by Narsil · closed 3 days ago · 0 comments
#1826 Process hangs in local run · by Hojun-Son · opened 3 days ago · 1 comment
#1825 Add the missing `tool_prompt` parameter to the Python client · by maziyarpanahi · closed 3 days ago · 1 comment
#1824 TGI model loading consumes all available GPU memory · by IdleIdiot · opened 5 days ago · 0 comments
#1823 Python client: extra slash in base_uri leads to failures in the chat endpoint · by kcarnold · opened 5 days ago · 0 comments
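Issue #1823 describes a classic URL-joining pitfall: a trailing slash on the client's base URI plus a leading slash on the endpoint path produces a `//` in the request URL, which some servers reject. A sketch of a normalization helper (hypothetical, not the Python client's actual fix):

```python
def join_endpoint(base_url: str, path: str) -> str:
    # Join a base URL and an endpoint path with exactly one slash,
    # regardless of trailing/leading slashes on either side.
    return base_url.rstrip("/") + "/" + path.lstrip("/")

# Both spellings of the base URI yield the same well-formed URL.
print(join_endpoint("http://localhost:8080/", "/v1/chat/completions"))
print(join_endpoint("http://localhost:8080", "v1/chat/completions"))
```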
#1822 Support for ReFT · by RonanKMcGovern · opened 5 days ago · 0 comments
#1821 Support for InternVL-Chat-V1-5 · by Iven2132 · opened 6 days ago · 1 comment
#1820 Changing the waiting_served_ratio default (stack more aggressively by default) · by Narsil · closed 4 days ago · 1 comment
#1819 Planned/potential significant work · by Narsil · opened 6 days ago · 0 comments
#1818 Fixing Qwen2 · by Narsil · closed 6 days ago · 0 comments
#1817 Dummy CI run · by Narsil · closed 6 days ago · 0 comments
#1816 Second round of benchmark modifications (tiny adjustments to avoid overloading the host) · by Narsil · closed 6 days ago · 0 comments
#1815 Blunder · by Narsil · closed 6 days ago · 0 comments
#1814 Shared volume using mountpoint-s3: permission issues · by Smana · opened 1 week ago · 5 comments
#1813 feat: add one-line quickstart · by drbh · closed 3 days ago · 1 comment
#1812 feat: add VLM docs and simple examples · by drbh · closed 3 days ago · 2 comments
#1811 Fixing frequency penalty · by martinigoyanes · closed 3 days ago · 6 comments
#1810 Frequency penalty corrupting generations · by martinigoyanes · closed 2 days ago · 1 comment
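Issues #1810 and #1811 concern the frequency penalty. In the common OpenAI-style formulation, each token's logit is reduced in proportion to how many times that token has already been generated; a bug in this step can corrupt generations across the whole vocabulary. A sketch of the formula (for illustration only, not TGI's implementation):

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_ids, penalty):
    # OpenAI-style frequency penalty: logit[t] -= penalty * count(t),
    # where count(t) is how often token t appears in the output so far.
    # Tokens never generated (count 0) are left untouched.
    counts = Counter(generated_ids)
    return [logit - penalty * counts[tok] for tok, logit in enumerate(logits)]

# Token 1 was generated twice, token 2 once, token 0 never.
print(apply_frequency_penalty([1.0, 2.0, 3.0], [1, 1, 2], penalty=0.5))
```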
#1809 Inference error for Mistral-7B v0.2 while deploying in an Azure VM · by a-myth-biswas · opened 1 week ago · 0 comments
#1808 Use the generation config · by Narsil · closed 1 week ago · 1 comment
#1807 Add support for the Phi-3 model · by ChristophRaab · closed 1 day ago · 4 comments
#1806 Take num_return_sequences into account to get multiple outputs · by mauryaland · opened 1 week ago · 0 comments
#1805 EETQ quantization cannot be performed locally · by Gongzai-SURE · opened 1 week ago · 0 comments