kvcache-ai / ktransformers

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Apache License 2.0

[fix] Fix multi-GPU support in some GPU dequant functions #88

Closed: Azure-Tang closed this 3 days ago

Azure-Tang commented 5 days ago
  1. Fix a bug where some GPU dequant functions did not support multiple GPUs (#85). Tested on DeepSeek-V2 with the IQ4_XS quantization type.

  2. Update the outdated README.