predibase / lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
https://loraexchange.ai
Apache License 2.0 · 1.85k stars · 125 forks
Issues
#529  Refactor the lora load function for clarity and simplicity (ajtejankar, opened 1 day ago, 0 comments)
#528  Fix: Use correct local path when loading base model from s3 (fadebek, opened 1 day ago, 0 comments)
#527  Hi Guys, now that TGI is back under Apache-2.0 license, will lorax merge their updates? (SMAntony, opened 1 day ago, 0 comments)
#526  Adding Whisper model (Jeevi10, opened 2 days ago, 0 comments)
#525  Bug fix for illegal memory access error caused when running medusa lora and plain loras in parallel. (ajtejankar, closed 3 days ago, 1 comment)
#524  Added eager prefill option (tgaddair, closed 4 days ago, 0 comments)
#523  Fails hard on large requests (yunmanger1, opened 1 week ago, 2 comments)
#522  WIP - Runpod integration (noyoshi, opened 1 week ago, 0 comments)
#521  Generating garbage output (shreyansh26, opened 1 week ago, 2 comments)
#520  Disable fp8 kv cache for lovelace (tgaddair, closed 1 week ago, 0 comments)
#519  try s3 crt (noyoshi, opened 1 week ago, 1 comment)
#518  Add echo parameter in request (dennisrall, opened 1 week ago, 0 comments)
#517  integration test POC (noyoshi, opened 2 weeks ago, 0 comments)
#516  try out an integration test workflow (noyoshi, closed 2 weeks ago, 0 comments)
#515  docs: update development_env.md (eltociear, closed 1 week ago, 0 comments)
#514  Fix issue with GQA initialization for Qwen2 (arnavgarg1, closed 2 weeks ago, 0 comments)
#513  fix batching bug (magdyksaleh, closed 2 weeks ago, 0 comments)
#512  Important: In latest main, the server can not serve more than 1 user (prd-tuong-nguyen, opened 2 weeks ago, 2 comments)
#511  can't start my local llama3 model server with docker (cheney369, opened 2 weeks ago, 0 comments)
#510  Fixed case where loaded lora adapter has no segments (tgaddair, closed 2 weeks ago, 0 comments)
#509  Add Support for AutoModelForSequenceClassification Models (akkky02, opened 2 weeks ago, 0 comments)
#508  Add distilbert (magdyksaleh, closed 2 weeks ago, 1 comment)
#507  Bert to gpu (magdyksaleh, closed 2 weeks ago, 0 comments)
#506  feat: return usage in ChatCompletionStreamResponse (GirinMan, closed 2 weeks ago, 0 comments)
#505  Fail to load special token in phi-3 (prd-tuong-nguyen, closed 2 weeks ago, 0 comments)
#504  Why are qlora (4bit) and lora (16bit) adapter file sizes the same? (codybum, closed 2 weeks ago, 1 comment)
#503  Add support for batching to embedder models (tgaddair, closed 3 weeks ago, 0 comments)
#502  can't run lorax with docker. (cheney369, closed 3 weeks ago, 1 comment)
#501  (WIP) Support targeting the embedding layer for LoRA (ajtejankar, opened 3 weeks ago, 1 comment)
#500  AssertionError when using model "google/gemma-2b" with multi-gpus (tritct, opened 3 weeks ago, 0 comments)
#499  Fixed phi-3 with Su Rotary Embedding (tgaddair, closed 3 weeks ago, 0 comments)
#498  Revert AWQ to stable commit (tgaddair, closed 3 weeks ago, 0 comments)
#497  experimental support fp8 (flozi00, closed 3 weeks ago, 0 comments)
#496  Bump client to v0.6.1 (tgaddair, closed 3 weeks ago, 0 comments)
#495  Add retries on common session errors for the client (gyanesh-mishra, closed 3 weeks ago, 1 comment)
#494  Fix quant cache OOM (flozi00, closed 1 month ago, 0 comments)
#493  Update Makefile-awq (flozi00, closed 3 weeks ago, 1 comment)
#492  Fix issue with Medusa batch load signature (tgaddair, closed 1 month ago, 0 comments)
#491  hqq upgrades (flozi00, closed 3 weeks ago, 2 comments)
#490  add missed dtypes for 8bit kv cache (flozi00, closed 1 month ago, 1 comment)
#489  Quickstart example not working (jmorenobl, opened 1 month ago, 3 comments)
#488  Bump lorax client v0.6.0 (tgaddair, closed 1 month ago, 0 comments)
#487  chore: update infer.rs (eltociear, closed 1 month ago, 1 comment)
#486  int: Bump Lorax Client to 3.9 (gyanesh-mishra, closed 1 month ago, 1 comment)
#485  Fail to run Phi-3 (prd-tuong-nguyen, closed 3 weeks ago, 9 comments)
#484  `make install` insufficient for running llama3-8B-Instruct (fozziethebeat, opened 1 month ago, 4 comments)
#483  Quantized KV Cache (flozi00, closed 1 month ago, 0 comments)
#482  Support jointly trained Medusa + LoRA adapters (tgaddair, closed 1 month ago, 0 comments)
#481  Add HTTP status codes to docs (noyoshi, opened 1 month ago, 1 comment)
#480  start porting latest tgi (flozi00, closed 1 month ago, 4 comments)