predibase / lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
https://loraexchange.ai
Apache License 2.0 · 1.85k stars · 125 forks
Issues
#529  Refactor the lora load function for clarity and simplicity (ajtejankar, opened 1 day ago, 0 comments)
#528  Fix: Use correct local path when loading base model from s3 (fadebek, opened 1 day ago, 0 comments)
#527  Hi Guys, now that TGI is back under Apache-2.0 license, will lorax merge their updates? (SMAntony, opened 1 day ago, 0 comments)
#526  Adding Whisper model (Jeevi10, opened 2 days ago, 0 comments)
#525  Bug fix for illegal memory access error caused when running medusa lora and plain loras in parallel. (ajtejankar, closed 3 days ago, 1 comment)
#524  Added eager prefill option (tgaddair, closed 4 days ago, 0 comments)
#523  Fails hard on large requests (yunmanger1, opened 1 week ago, 2 comments)
#522  WIP - Runpod integration (noyoshi, opened 1 week ago, 0 comments)
#521  Generating garbage output (shreyansh26, opened 1 week ago, 2 comments)
#520  Disable fp8 kv cache for lovelace (tgaddair, closed 1 week ago, 0 comments)
#519  try s3 crt (noyoshi, opened 1 week ago, 1 comment)
#518  Add echo parameter in request (dennisrall, opened 1 week ago, 0 comments)
#517  integration test POC (noyoshi, opened 2 weeks ago, 0 comments)
#516  try out an integration test workflow (noyoshi, closed 2 weeks ago, 0 comments)
#515  docs: update development_env.md (eltociear, closed 1 week ago, 0 comments)
#514  Fix issue with GQA initialization for Qwen2 (arnavgarg1, closed 2 weeks ago, 0 comments)
#513  fix batching bug (magdyksaleh, closed 2 weeks ago, 0 comments)
#512  Important: In latest main, the server can not serve more than 1 user (prd-tuong-nguyen, opened 2 weeks ago, 2 comments)
#511  can't start my local llama3 model server with docker (cheney369, opened 2 weeks ago, 0 comments)
#510  Fixed case where loaded lora adapter has no segments (tgaddair, closed 2 weeks ago, 0 comments)
#509  Add Support for AutoModelForSequenceClassification Models (akkky02, opened 2 weeks ago, 0 comments)
#508  Add distilbert (magdyksaleh, closed 2 weeks ago, 1 comment)
#507  Bert to gpu (magdyksaleh, closed 2 weeks ago, 0 comments)
#506  feat: return usage in ChatCompletionStreamResponse (GirinMan, closed 2 weeks ago, 0 comments)
#505  Fail to load special token in phi-3 (prd-tuong-nguyen, closed 2 weeks ago, 0 comments)
#504  Why are qlora (4bit) and lora (16bit) adapter file sizes the same? (codybum, closed 2 weeks ago, 1 comment)
#503  Add support for batching to embedder models (tgaddair, closed 3 weeks ago, 0 comments)
#502  can't run lorax with docker. (cheney369, closed 3 weeks ago, 1 comment)
#501  (WIP) Support targeting the embedding layer for LoRA (ajtejankar, opened 3 weeks ago, 1 comment)
#500  AssertionError when using model "google/gemma-2b" with multi-gpus (tritct, opened 3 weeks ago, 0 comments)
#499  Fixed phi-3 with Su Rotary Embedding (tgaddair, closed 3 weeks ago, 0 comments)
#498  Revert AWQ to stable commit (tgaddair, closed 3 weeks ago, 0 comments)
#497  experimental support fp8 (flozi00, closed 3 weeks ago, 0 comments)
#496  Bump client to v0.6.1 (tgaddair, closed 3 weeks ago, 0 comments)
#495  Add retries on common session errors for the client (gyanesh-mishra, closed 3 weeks ago, 1 comment)
#494  Fix quant cache OOM (flozi00, closed 1 month ago, 0 comments)
#493  Update Makefile-awq (flozi00, closed 3 weeks ago, 1 comment)
#492  Fix issue with Medusa batch load signature (tgaddair, closed 1 month ago, 0 comments)
#491  hqq upgrades (flozi00, closed 3 weeks ago, 2 comments)
#490  add missed dtypes for 8bit kv cache (flozi00, closed 1 month ago, 1 comment)
#489  Quickstart example not working (jmorenobl, opened 1 month ago, 3 comments)
#488  Bump lorax client v0.6.0 (tgaddair, closed 1 month ago, 0 comments)
#487  chore: update infer.rs (eltociear, closed 1 month ago, 1 comment)
#486  int: Bump Lorax Client to 3.9 (gyanesh-mishra, closed 1 month ago, 1 comment)
#485  Fail to run Phi-3 (prd-tuong-nguyen, closed 3 weeks ago, 9 comments)
#484  `make install` insufficient for running llama3-8B-Instruct (fozziethebeat, opened 1 month ago, 4 comments)
#483  Quantized KV Cache (flozi00, closed 1 month ago, 0 comments)
#482  Support jointly trained Medusa + LoRA adapters (tgaddair, closed 1 month ago, 0 comments)
#481  Add HTTP status codes to docs (noyoshi, opened 1 month ago, 1 comment)
#480  start porting latest tgi (flozi00, closed 1 month ago, 4 comments)