Closed by binarycrayon 1 week ago
Whether a library can be integrated depends on whether there is demand for it and whether it offers a performance advantage. Would you be willing to verify this library's performance with PyTorch Benchmark (https://pytorch.org/tutorials/recipes/recipes/benchmark.html)? My current impression is that it likely has no advantage over our existing implementation, and there are many similar libraries, such as https://github.com/AlibabaPAI/FLASHNN and https://github.com/FlagOpen/FlagGems. What is its advantage over those? We are very cautious about introducing unnecessary dependencies.
@zhyncs got it, thanks for the response. I can run PyTorch benchmarks for the said library and post the results here. Thank you!
Will a benchmark on an A100 80GB be sufficient?
It's better to benchmark on both A100 and H100. Thanks!
Liger Kernel team here! Currently Liger Kernel is optimized for training (e.g., a fused linear + cross-entropy layer to slash memory), so I believe inference-specific libraries might do better than us here.
Hi, thanks for the comment, that makes sense. I will close this for now. We can open a new issue if we want to revisit it in the future.
Checklist
Motivation
Liger's Triton kernels can be applied to Hugging Face models as a one-line patch. They provide inference speedup and memory reduction.
Related resources
https://github.com/linkedin/Liger-Kernel