Closed · fabianlim closed this 2 months ago
awesome, great results @fabianlim
Indeed, awesome results @fabianlim!
@wynterl @raghukiran1224 the loss for BNB + fused ops looks problematic. ~Needs more debugging~ OK, I found that it's because Granite has a bias in the Linear layers, but the FOAK kernels do not support bias. This just requires some minor (but tedious) modifications.
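For context, a minimal sketch (not part of this PR; the checkpoint id is only a placeholder) that lists which `nn.Linear` modules of a Granite checkpoint actually carry a bias term, i.e. the layers the current FOAK kernels cannot fuse:

```python
# Minimal sketch: report which Linear layers of a Granite checkpoint have a bias.
# The checkpoint id is a placeholder; substitute whichever Granite model you benchmark.
import torch.nn as nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("ibm-granite/granite-3.0-2b-instruct")

for name, module in model.named_modules():
    if isinstance(module, nn.Linear) and module.bias is not None:
        print(f"{name}: bias present")
```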
In this PR we update the benchmarks for `GraniteCausalLM`.
Note this PR requires the following dependency updates:
- `transformers>=4.45`: for `GraniteCausalLM`
- `accelerate>=0.34.1`: required for `transformers>=4.45` if `GraniteCausalLM` is needed
- `trl > 0.11.1`: when using baseline bnb, requires this fix for a bug that was introduced in `transformers==4.45`: https://github.com/huggingface/trl/pull/2089
- `bitsandbytes==0.43.3`: it seems that later versions give segmentation fault errors
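As a quick sanity check, here is a minimal sketch (not part of this PR) that verifies the installed versions satisfy the pins above, assuming the `packaging` library is available:

```python
# Minimal sketch: check installed versions against the dependency pins listed above.
from importlib.metadata import version
from packaging.specifiers import SpecifierSet

pins = {
    "transformers": ">=4.45",
    "accelerate": ">=0.34.1",
    "trl": ">0.11.1",
    "bitsandbytes": "==0.43.3",
}

for pkg, spec in pins.items():
    installed = version(pkg)
    ok = installed in SpecifierSet(spec)
    print(f"{pkg}=={installed} satisfies '{spec}': {ok}")
```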
Known issues with quant peft:
- … (`low_cpu_mem_mode`)
- … (`low_cpu_mem_mode`)

Performance
Overall, impressive improvements with the kernels.
FULL FT
PEFT
Quantized PEFT (BNB)
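For readers unfamiliar with the quantized PEFT (BNB) setting above, here is a minimal sketch of what it amounts to: a 4-bit bitsandbytes load plus LoRA adapters. The checkpoint id, target modules, and LoRA hyperparameters are illustrative placeholders, not the configuration used for these benchmarks:

```python
# Illustrative sketch of quantized PEFT with bitsandbytes: 4-bit NF4 load + LoRA adapters.
# Checkpoint id and LoRA hyperparameters are placeholders, not the benchmark config.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "ibm-granite/granite-3.0-2b-instruct",  # placeholder checkpoint
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```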