kvcache-ai / ktransformers

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Apache License 2.0

[Fix] Fix problem that ktransformers cannot offload whole layer in cpu #62

Closed Azure-Tang closed 2 months ago

Azure-Tang commented 2 months ago
  1. Fix a bug where ktransformers could not offload a whole layer to the CPU.
  2. Update DeepseekV2's multi-GPU YAML examples to allocate layers evenly across GPUs.
  3. Update the Dockerfile.
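
For reference, the "evenly allocate layers" idea in item 2 can be sketched as below. This is only an illustration, not the code or YAML changed in this PR: the helper name `split_layers_evenly`, the device strings, and the layer count of 60 are assumptions chosen for the example.

```python
def split_layers_evenly(num_layers: int, devices: list[str]) -> dict[str, range]:
    """Assign contiguous, near-equal blocks of layer indices to each device.

    Hypothetical helper for illustration; it is not part of ktransformers.
    """
    if num_layers < 0:
        raise ValueError("num_layers must be non-negative")
    if not devices:
        raise ValueError("at least one device is required")

    # Every device gets `base` layers; the first `extra` devices get one more.
    base, extra = divmod(num_layers, len(devices))

    allocation: dict[str, range] = {}
    start = 0
    for index, device in enumerate(devices):
        size = base + (1 if index < extra else 0)
        allocation[device] = range(start, start + size)
        start += size
    return allocation


if __name__ == "__main__":
    # Assumed example: 60 decoder layers split across two GPUs.
    for device, layers in split_layers_evenly(60, ["cuda:0", "cuda:1"]).items():
        print(f"{device}: layers {layers.start}..{layers.stop - 1}")
```

The resulting index ranges are what one would then write into the per-device layer rules of a multi-GPU YAML file, so that no single GPU holds noticeably more layers than the others.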