kvcache-ai / ktransformers

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Apache License 2.0
741 stars 39 forks

[fix] Fix bug where some GPU dequant functions don't support multi-GPU #88

Closed Azure-Tang closed 2 months ago

Azure-Tang commented 2 months ago
  1. Fix a bug where some GPU dequant functions did not support multi-GPU setups (#85). Tested on DeepSeek-V2 with the IQ4_XS quantization type.

  2. Update the outdated README.