huggingface/text-generation-inference
Large Language Model Text Generation Inference
http://hf.co/docs/text-generation-inference
Apache License 2.0
8.06k stars
881 forks
issues
Pydantic validation error re: ChoiceDelta (text_generation/types.py)
#1927
mphipps2
opened
8 hours ago
0
version in docker not correct
#1926
arunpatala
opened
17 hours ago
0
docs: Fix grafana dashboard url
#1925
edwardzjl
opened
23 hours ago
0
ROCm: make CK FA2 default instead of Triton
#1924
fxmarty
closed
1 day ago
2
ROCm: Make CK FA2 default
#1923
fxmarty
closed
1 day ago
2
Fix TGI issues with ROCm
#1921
fxmarty
closed
3 days ago
1
Fix TunableOp bug
#1920
fxmarty
closed
3 days ago
1
Fix cudagraph bug
#1919
fxmarty
closed
3 days ago
2
Update grafana template
#1918
fxmarty
closed
3 days ago
0
Fixing the download strategy for ibm-fms
#1917
Narsil
closed
2 days ago
0
Update documentation version 1.4 -> 2.0.3
#1916
fxmarty
closed
3 days ago
2
Removing some unused code.
#1915
Narsil
closed
3 days ago
0
LlavaNext Model cannot be started
#1914
paulcx
opened
3 days ago
0
Wrong validations on `Parameters` in TGI python library
#1913
Jason-CKY
opened
4 days ago
0
feat: experimental python packaging and interface
#1912
drbh
opened
4 days ago
3
Fixing signals.
#1910
Narsil
closed
4 days ago
0
Types.
#1909
Narsil
closed
4 days ago
0
Add TGI monitoring guide through Grafana and Prometheus
#1908
fxmarty
closed
3 days ago
2
Phi-3 not starting on TGI 2.0.3 in kubernetes cluster
#1907
Cyb4Black
opened
4 days ago
1
Fixing types.
#1906
Narsil
closed
4 days ago
0
Docs missing for LLaVA NeXT Model
#1905
RonanKMcGovern
opened
4 days ago
0
Clarification and supplement to the online docs example
#1904
paulcx
opened
4 days ago
0
error: unexpected argument '--max-input-tokens' found
#1903
moruga123
opened
5 days ago
1
[Bug Fix] Update torch import reference in bnb quantization
#1902
DhruvSrikanth
closed
5 days ago
0
metric: tgi_request_total increments by 2 upon every request
#1901
thenu97
opened
5 days ago
0
Document Request
#1900
oroojlooy
closed
4 days ago
2
StarCoder2 AWQ does not work correctly
#1899
johan12345
opened
5 days ago
0
Removing accepted ids in the regular info logs, downgrade to debug.
#1898
Narsil
closed
5 days ago
0
HF web service streaming response differs from OpenAI, breaking clients
#1896
dluc
opened
6 days ago
0
Pali gemma modeling
#1895
drbh
closed
4 days ago
0
[CI] OpenAI function calling compatible support
#1894
drbh
closed
6 days ago
1
LoRA Adapter from local model are leading to error
#1893
philschmid
opened
6 days ago
0
Correct 'using guidance' link
#1892
brandon-lockaby
closed
6 days ago
0
TGI 2.0.2 CodeLlama error `piece id is out of range.`
#1891
philschmid
opened
6 days ago
0
Fixing truncation.
#1890
Narsil
closed
6 days ago
0
Add GPT-2 with flash attention
#1889
danieldk
closed
5 days ago
1
OpenAI function calling compatible support
#1888
phangiabao98
closed
4 days ago
5
Router /v1/chat/completions not compatible with openai spec
#1887
phangiabao98
opened
6 days ago
1
Add: Support for the Falcon2 11B architecture
#1886
Nilabhra
closed
6 days ago
0
Min P generation parameter
#1885
LawrenceGrigoryan
opened
1 week ago
2
Add GPT-2 with flash attention
#1884
danieldk
closed
6 days ago
0
Question about KV cache
#1883
martinigoyanes
closed
5 days ago
3
Granite support?
#1882
Narsil
closed
1 week ago
0
SnapKV support
#1881
icyxp
opened
1 week ago
0
Logging has no formatting when using docker environment instead of command
#1880
onel
opened
1 week ago
0
Multi-Model Endpoint support in Sagemaker
#1878
Najib-Haq
opened
1 week ago
0
concurrent requests permit limit is broken
#1877
oOraph
opened
1 week ago
0
text generation details not working when stream=False
#1876
uyeongkim
opened
1 week ago
1
How to share memory among 2 GPUS for distributed inference?
#1875
martinigoyanes
opened
1 week ago
10
Automatic NUMA binding
#1874
fxmarty
opened
1 week ago
0