kvcache-ai / ktransformers

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Apache License 2.0

Release v0.1.2 #36

Closed by UnicornChan 3 months ago

UnicornChan commented 3 months ago
  1. Support native Windows.
  2. Support multiple GPUs.
  3. Support qlen > 1.
  4. Support new models: Mixtral 8x7B and 8x22B.
  5. Support q2k and q3k dequantization on the GPU.
  6. Support GitHub Actions for building precompiled packages.
  7. Fix some bugs.