punica-ai / punica

Serving multiple LoRA finetuned LLM as one
https://arxiv.org/abs/2310.18547
Apache License 2.0

fix: kernel launch failure due to smem overflow #20

Closed jcao-ai closed 7 months ago

jcao-ai commented 7 months ago

If `smem > 46 * 1024`, we should explicitly call `cudaFuncSetAttribute(..., cudaFuncAttributeMaxDynamicSharedMemorySize, ...)` to ensure the kernel launches successfully.
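For illustration, the opt-in pattern described above might look roughly like the sketch below. This is not the code from this PR: `scale_kernel`, `launch_scale`, and `needs_smem_opt_in` are hypothetical names, and the threshold simply mirrors the `46 * 1024` value quoted above.

```cuda
#include <cassert>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical kernel (not from punica): stages each element through
// dynamically sized shared memory before writing the scaled result.
__global__ void scale_kernel(const float* in, float* out, int n) {
  extern __shared__ float tile[];
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    tile[threadIdx.x] = in[i];
    out[i] = 2.0f * tile[threadIdx.x];
  }
}

// Threshold taken from the PR description; treated here as an assumption.
constexpr size_t kSmemOptInThreshold = 46 * 1024;

// True when the requested dynamic shared memory exceeds the threshold,
// so the launch path should opt in before launching.
inline bool needs_smem_opt_in(size_t smem_bytes) {
  return smem_bytes > kSmemOptInThreshold;
}

// Launches scale_kernel with `smem_bytes` of dynamic shared memory,
// raising the kernel's dynamic shared memory limit first when needed.
cudaError_t launch_scale(const float* in, float* out, int n,
                         size_t smem_bytes, cudaStream_t stream) {
  assert(n > 0);
  if (needs_smem_opt_in(smem_bytes)) {
    cudaError_t err = cudaFuncSetAttribute(
        scale_kernel, cudaFuncAttributeMaxDynamicSharedMemorySize,
        static_cast<int>(smem_bytes));
    if (err != cudaSuccess) return err;  // e.g. request exceeds device limit
  }
  const int threads = 256;
  const int blocks = (n + threads - 1) / threads;
  scale_kernel<<<blocks, threads, smem_bytes, stream>>>(in, out, n);
  return cudaGetLastError();  // reports launch-time failures
}
```

A caller would pass device pointers and the desired shared memory size, for example `launch_scale(d_in, d_out, n, 64 * 1024, stream)`. The largest value a device accepts can be queried with `cudaDeviceGetAttribute` and `cudaDevAttrMaxSharedMemoryPerBlockOptin`; a request above that limit makes `cudaFuncSetAttribute` return an error, which the sketch propagates to the caller.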

codecov[bot] commented 7 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (1495038) 30.57% compared to head (060820f) 30.57%.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##           master      #20   +/-   ##
=======================================
  Coverage   30.57%   30.57%
=======================================
  Files           9        9
  Lines         677      677
=======================================
  Hits          207      207
  Misses        470      470
```


abcdabcd987 commented 7 months ago

Awesome! Thanks! I'll add test cases.