IBM/text-generation-inference
IBM development fork of https://github.com/huggingface/text-generation-inference
Apache License 2.0
57 stars
30 forks
issues
🔥 remove cuda-runtime entirely
#60
joerunde
closed
7 months ago
0
🔥 Remove our exllama code because we use auto-gptq vendored kernels
#59
tjohnson31415
closed
8 months ago
0
feat: update ibm_fms engine to support variant config overrides
#58
tjohnson31415
closed
8 months ago
0
Add `return_offsets` and `truncate_input_tokens` tokenize API options
#57
njhill
closed
7 months ago
0
Update python and rust dependencies
#56
njhill
closed
7 months ago
0
feat: add OpenTelemetry tracing support to router
#55
declark1
closed
7 months ago
0
feat: Make tokenizer `add_special_tokens` option configurable
#54
njhill
opened
8 months ago
0
🔥 delete _all_ the apt packages
#53
joerunde
closed
8 months ago
2
⬆️ upgrade nix and other deps
#52
joerunde
closed
8 months ago
4
Incorporate Marlin for GPTQ checkpoints into tgis_native
#51
cyang49
closed
7 months ago
3
Dockerfile: use base instead of cuda-runtime as base for server-release
#50
dtrifiro
closed
8 months ago
3
feat: handle safetensors conversion for unshared incomplete tensors
#49
tjohnson31415
closed
8 months ago
2
Update launcher to auto-convert fast tokenizer
#48
declark1
closed
8 months ago
0
fix CUDA OOM error when loading large models for hf_transformers engine
#47
dtrifiro
closed
8 months ago
3
👷 put python installs in separate stage
#46
joerunde
closed
8 months ago
1
Test AMD ROCm build
#45
maxdebayser
closed
3 months ago
0
Apply rustfmt, clippy suggestions, and other cleanups
#44
declark1
closed
8 months ago
0
ci: Push text-gen-server release images to Quay.io
#43
ckadner
closed
8 months ago
0
ci: Cleanup old build cache images
#42
ckadner
closed
8 months ago
0
🐛 always set word embeddings
#41
joerunde
closed
8 months ago
0
chore: Small formatting change in `cli.py` to avoid long line
#40
njhill
closed
8 months ago
0
Fix OOM due to large prompt cache
#39
joerunde
closed
8 months ago
0
chore: Remove flash-attention v1
#38
ckadner
closed
8 months ago
1
[WIP] ci: Use build cache
#37
ckadner
closed
8 months ago
0
Update various rust and python dependencies
#36
njhill
closed
8 months ago
0
Also convert .index.json file
#35
joerunde
closed
8 months ago
0
🎨 add SIGBUS startup failure warning message
#34
joerunde
closed
8 months ago
0
feat: allow configuration of the max soft prompt length
#33
joerunde
closed
8 months ago
0
🐛 truncate termination log file
#32
joerunde
closed
8 months ago
0
Update rust to 1.76
#31
joerunde
closed
8 months ago
0
Add linux cross-compilation option for launcher
#30
joerunde
closed
8 months ago
0
chore: Create custom CodeQL configuration
#29
ckadner
closed
8 months ago
1
pre-commit: add ruff hook
#28
dtrifiro
closed
6 months ago
2
gha: add dependabot config
#27
dtrifiro
closed
6 months ago
2
add pre-commit
#26
dtrifiro
closed
6 months ago
5
chore: Make help
#25
ckadner
closed
8 months ago
5
chore: Add issue templates
#24
ckadner
closed
9 months ago
1
test: Separate test and build workflows
#23
ckadner
closed
9 months ago
2
re-enable http generate endpoint
#22
dtrifiro
closed
4 months ago
2
deps: bump optimum to 1.16.1
#21
dtrifiro
closed
10 months ago
1
enable http generate endpoint
#20
dtrifiro
closed
10 months ago
3
Doc Request: PREFIX_STORE_PATH in README
#19
gabe-l-hart
opened
10 months ago
1
Add example for running inference locally
#18
helena-intel
opened
11 months ago
0
OpenVINO integration for CausalLM models
#17
helena-intel
opened
11 months ago
0
enabling Intel(R) Extension for PyTorch*
#16
kta-intel
opened
11 months ago
1
enabling Intel Extension for Pytorch
#15
kta-intel
closed
11 months ago
0
Cherry-pick of a few Dockerfile updates
#14
Xaenalt
closed
8 months ago
1
Optimizing Latency in FlashLlamaLayer.mlp During First Token Generation
#13
truenorth8
closed
1 year ago
0
test: Free up disk space for GH actions
#12
ckadner
closed
9 months ago
5
GitHub action failing with "No space left on device" error
#11
ckadner
closed
9 months ago
0