arthurv opened 5 days ago
Okay, we will update these packages in our next release.
Regarding quantizations, we now support multiple quantization methods (Qx_K and IQ4_XS). What else would you like?
I have a system with 192GB DRAM and 48GB VRAM (2x 3090). Would it be able to handle 128k context with those specs? Would it be able to handle Q5_K_M or Q6_K_M?
Also, I can only set max_new_tokens in local_chat, not in the ktransformers server, and I can't set the total context size anywhere.
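For the sizing question above, a rough back-of-envelope check is possible. This sketch assumes figures not stated in the thread: DeepSeek-Coder-V2 has ~236B parameters, and Q5_K_M averages roughly 5.5 bits per weight (Q6_K closer to 6.6) — actual GGUF sizes vary by tensor mix, and KV-cache memory for 128k context comes on top of this.

```python
# Hedged estimate of quantized-weight memory, NOT an official figure.
# Assumed values: ~236e9 params (DeepSeek-Coder-V2), average bits per
# weight of ~5.5 for Q5_K_M and ~6.6 for Q6_K.
def quant_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate in-memory size of the quantized weights, in GB."""
    return n_params * bits_per_weight / 8 / 1e9

for name, bpw in [("Q5_K_M", 5.5), ("Q6_K", 6.6)]:
    print(f"{name}: ~{quant_size_gb(236e9, bpw):.0f} GB weights")
```

By this estimate both quants fit the combined 240GB of DRAM+VRAM for the weights alone, but whether 128k context also fits depends on the KV-cache size, which this sketch does not model.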
Hi,
I tried to install ktransformers on a clean install of Linux Mint 22 (based on Ubuntu 24.04), and there are a few things that I had to add:
Please update the pip dependencies.
Are there any plans to increase the number of quants supported for Deepseek-Coder-V2-Instruct-0724?