punica-ai / punica

Serving multiple LoRA finetuned LLM as one

https://arxiv.org/abs/2310.18547

Apache License 2.0

883 stars, 40 forks

issues

Error when installing package from source

#52 Conless opened 1 month ago
0
fix(bgmv): write shared_memory y_warpsize only when threadIdx.x == 0

#51 menggeliu1205 opened 1 month ago
1
Rotary_pos_emb Miss?

#50 chenhongyu2048 closed 2 months ago
3
RunTimeError: output must be a cuda tensor

#49 iskander-sauma-assessio opened 2 months ago
2
[Question] How to avoid matrices conflict

#48 gyliu513 opened 3 months ago
0
improve bgmv expand kernel performance

#47 yyccli closed 3 months ago
1
pip install failed with cuda 12.2

#46 hayleyhu closed 3 months ago
3
Support Continuous Batching?

#45 Wisley1998 closed 3 months ago
1
[Feature Request] Add support for SM75

#44 sleepwalker2017 opened 3 months ago
4
Support for H100 GPUs?

#43 LorrinWWW opened 4 months ago
1
why cuda arch should >= 8.0?

#42 yyccli closed 4 months ago
5
Confusion about offset += 8

#41 carryyu closed 4 months ago
2
BGMV performs better than SGMV?

#40 ghost closed 4 months ago
1
Inquiry on cuda memory across processes

#39 mozizhao opened 5 months ago
0
Support qwen?

#38 cgq0816 opened 5 months ago
1
chore(master): release 1.1.1

#37 github-actions[bot] opened 5 months ago
0
ImportError: cannot import name 'BatchedKvCache' from 'punica'

#36 VisionwWu opened 5 months ago
1
Fixed deadlock in sgmv_shrink kernel caused by skewed segments

#35 tgaddair closed 5 months ago
3
chore(master): release 1.1.0

#34 github-actions[bot] closed 6 months ago
1
bug: sgmv_shrink does not support CUDA graph tracing

#33 tgaddair closed 6 months ago
9
chore(master): release 1.0.3

#32 github-actions[bot] closed 6 months ago
1
chore(master): release 1.0.2

#31 github-actions[bot] closed 6 months ago
1
chore(master): release 1.0.1

#30 abcdabcd987 closed 6 months ago
1
chore(master): release 1.0.0

#29 github-actions[bot] closed 6 months ago
1
chore: bootstrap releases for path: .

#28 abcdabcd987 closed 6 months ago
1
Question about performance

#27 jcao-ai closed 6 months ago
2
About scheduler

#26 luzai opened 6 months ago
2
flashinfer shrink vs cutlass

#25 YLGH closed 7 months ago
6
A question on serving multiple-lora models in one server

#24 zihaolucky closed 6 months ago
3
Error when inferencing from the Punica format

#23 bibekyess closed 7 months ago
2
[Bugfix] Fix the incorrect offset computation for SGMV shrink

#22 yzh119 closed 7 months ago
1
Choosing adapters on inference

#21 BaiStone2017 closed 6 months ago
2
fix: kernel launch failure due to smem overflow

#20 jcao-ai closed 7 months ago
2
Multi GPU and Multi Node solution

#19 luciferlinx101 opened 7 months ago
4
ci: add gpu tests

#18 abcdabcd987 closed 7 months ago
0
The smallest rank supported is 16?

#17 jcao-ai closed 7 months ago
5
ModuleNotFoundError: No module named 'rich'

#16 luciferlinx101 closed 7 months ago
2
Support for GPT-NEOX models

#15 bibekyess opened 7 months ago
0
Update README.md

#14 luciferlinx101 closed 7 months ago
1
pip install -v --no-build-isolation . is giving errors

#13 luciferlinx101 closed 7 months ago
5
ModuleNotFoundError: No module named 'punica.ops._kernels'

#12 bibekyess closed 7 months ago
7
sgmv_cutlass calculate wrong output

#11 harryhan618 opened 7 months ago
8
[FIX] Benchmark build script

#10 jeromeku closed 7 months ago
1
License to use this library?

#9 Algy closed 7 months ago
4
Error installing package

#8 jkl375 closed 7 months ago
7
Why punica is faster than pretrained LLM?

#7 shushengyuan closed 7 months ago
4
Question about the work

#6 Gorluxor closed 7 months ago
2
Add support for running on Colab

#5 dzlab opened 7 months ago
4
Update README.md

#4 eltociear closed 7 months ago
1
Are the LoRAs all trained starting from a base model?

#3 lucasjinreal closed 7 months ago
1