triton-inference-server/fastertransformer_backend
BSD 3-Clause "New" or "Revised" License
411 stars
133 forks
issues
Multi-instance inference fails in (n-1)/n runs (where n is the number of GPUs/instances)
#63
timofeev1995
opened
1 year ago
29
Memory usage not going up with model instances
#62
samipdahalr
opened
1 year ago
1
Can't deploy multiple versions of BERT.
#61
ogis-uno
closed
1 year ago
10
Fastertransformer BERT returns wrong value in my environment.
#60
ogis-uno
closed
1 year ago
7
Can't re-load any T5 model after a first load/unload iteration
#59
Thytu
opened
1 year ago
5
build: ci
#58
Thytu
closed
1 year ago
1
Request to support GCS file path
#57
aasthajh
opened
1 year ago
2
docs: fix formatting in README
#56
Thytu
closed
1 year ago
0
Is there any kind of caching?
#55
timofeev1995
closed
1 year ago
2
GPTJ end_id usage and behavior
#54
timofeev1995
closed
1 year ago
3
Unexpected behavior of batched inference of GPT-J
#53
AlekseyKorshuk
closed
1 year ago
24
Can't run multi-node GPTJ inference
#52
BDHU
opened
1 year ago
11
Adding option in identity_test.py client to support decoupled=True
#51
pcastonguay
closed
1 year ago
0
Using GEMM files in fastertransformer_backend.
#49
SnoozingSimian
closed
1 year ago
3
Recommendation for the complete BERT model deployment on Triton + fastertransformer backend
#46
vblagoje
closed
1 year ago
4
GPT-J Preprocessing Incorrectly Tokenizes `<|endoftext|>`
#45
mitchellgordon95
opened
2 years ago
8
Streaming throwing queue.get() error
#44
rtalaricw
opened
2 years ago
2
GPT-NeoX throws Segmentation Fault (Signal 6)
#43
rtalaricw
closed
2 years ago
15
Byshiue patch 1
#42
byshiue
closed
2 years ago
0
Crash GPT-J if 'output0_len' is greater than 240.
#41
daemyung
closed
2 years ago
4
Crash GPT-J on mGPU
#40
daemyung
closed
2 years ago
10
Can you share data.json to run perf_analyzer?
#39
daemyung
closed
2 years ago
2
Added fauxpilot changes
#38
lucataco
closed
2 years ago
0
Support mt5 (t5 v1.1)?
#37
hong8c
closed
1 year ago
3
Update CMakeLists.txt
#36
byshiue
closed
2 years ago
0
Does FT support serving multiple models concurrently?
#35
PKUFlyingPig
closed
2 years ago
1
Failed to run FasterTransformer BERT Triton Backend with multiple instances.
#34
PKUFlyingPig
closed
2 years ago
21
Pipeline parallelism does not work for FasterTransformer BERT Triton Backend.
#33
PKUFlyingPig
closed
2 years ago
14
t5_guide.md shows 0 BLEU score
#32
hong8c
closed
2 years ago
4
feat: update v1.2
#31
byshiue
closed
2 years ago
0
Spelling
#30
jsoref
closed
2 years ago
1
FT backend crashes Triton server if batch size is too large
#29
moyix
opened
2 years ago
0
FasterTransformer freezes on 4 GPUs while running GPT with NCCL_LAUNCH_MODE=GROUP
#28
saramcallister
closed
2 years ago
8
FasterTransformer freezes on 4 GPUs while running GPT with NCCL_LAUNCH_MODE=GROUP
#27
saramcallister
closed
2 years ago
2
Streaming for fastertransformer using gRPC
#26
rtalaricw
closed
2 years ago
6
Results output same value with zero probability in GPTJ-6B
#25
rtalaricw
closed
2 years ago
16
Segmentation fault: address not mapped to object at address (nil)
#24
shimoshida
closed
2 years ago
8
Dynamic Batching with Different Sized Context (Ragged)
#23
jimwu6
closed
2 years ago
4
Merge v1.1 branch to main branch
#22
byshiue
closed
2 years ago
0
Allow mT5 support alongside T5
#21
RegaliaXYZ
closed
2 years ago
3
dynamic_batching with model config
#20
hajime9652
closed
2 years ago
2
FasterTransformer might freeze after few requests
#19
jimwu6
closed
2 years ago
4
does it also support general transformer encoders like bert?
#18
zhanghaoie
closed
2 years ago
3
Fix config.pbtxt file path in README
#17
jimwu6
closed
2 years ago
0
Error if Triton Binary is started early
#16
jimwu6
closed
2 years ago
2
will FT5.0 be supported ?
#15
520jefferson
closed
2 years ago
2
Install Go 1.16 with precompiled binary
#14
jimwu6
closed
2 years ago
1
update identity_test script
#13
yuanzhedong
closed
3 years ago
0
use nvidia-smi to track mem usage
#12
yuanzhedong
closed
3 years ago
0
Refine benchmark script with mem usage
#11
yuanzhedong
closed
3 years ago
0