Closed: aleeusgr closed this issue 6 months ago
https://github.com/cursorless-dev/cursorless ❌ depends on Talon, which is closed-source software.
I could look for an open-source alternative and check whether Cursorless can be used with a different engine.
DeepSeek-Chat and DeepSeek-Coder are regarded as among the best coding models, AFAIK: https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct
How to choose a model: https://www.tensorops.ai/post/what-are-quantized-llms
While we might expect that reducing precision would reduce accuracy, Meta researchers have demonstrated that in some cases the quantized model not only shows superior performance but also allows for reduced latency and enhanced throughput. The same trend can be observed when comparing an 8-bit 13B model with a 16-bit 7B model. In essence, when comparing models with similar inference costs, the larger quantized models can outperform their smaller, non-quantized counterparts. This advantage becomes even more pronounced with larger networks, as they exhibit a smaller quality loss when quantized.
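The 8-bit-13B vs. 16-bit-7B comparison above can be made concrete with a back-of-the-envelope weight-memory estimate: bytes ≈ parameters × bits-per-weight / 8. A minimal sketch (parameter counts are nominal; real runtime memory also includes activations and the KV cache):

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory needed just for the model weights, in GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# The comparison from the quote: similar memory budget, more parameters.
fp16_7b = weight_memory_gb(7, 16)    # 16-bit 7B  -> 14.0 GB
int8_13b = weight_memory_gb(13, 8)   # 8-bit 13B  -> 13.0 GB
int4_33b = weight_memory_gb(33, 4)   # 4-bit 33B  -> 16.5 GB

print(f"fp16 7B:  {fp16_7b:.1f} GB")
print(f"int8 13B: {int8_13b:.1f} GB")
print(f"int4 33B: {int4_33b:.1f} GB")
```

So at roughly the same VRAM budget you can fit nearly twice the parameters at 8-bit, which is why the larger quantized model can come out ahead.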
You might have a better experience using EXL2/GPTQ quantization with bigger models.