turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License
3.2k stars, 236 forks

Kalomaze's Quadratic Sampling #317

Closed AAbushady closed 5 months ago

AAbushady commented 5 months ago

Quadratic Sampling

test157t commented 5 months ago

I don't have any capacity to make decisions here, but I approve.

turboderp commented 5 months ago

I think an exponent of 2.3 works better than 2. Change my mind.

AAbushady commented 5 months ago

I know, I know, but this one is good; a lot of people have tested it. I do need to make a change after I'm done with work today, though: the order of operations is sub-optimal for applying quadratic sampling. So more to come there.

Ph0rk0z commented 5 months ago

Make it an option.

AAbushady commented 5 months ago

Okay, updates are made!

frammiie commented 5 months ago

This one is huge, boss! 💪

test157t commented 5 months ago

[image]

turboderp commented 5 months ago

So there. With a secret cubic sampling option for those brave enough to seek it out.
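
For context on what the exponent discussion refers to, here is a minimal sketch of quadratic sampling as it is commonly described: each logit is pulled toward the maximum logit by a penalty proportional to the squared distance from it. The function name `smooth_logits` and the `exponent` parameter are assumptions made for illustration, not exllamav2's actual API or implementation; `exponent=2.0` gives the quadratic form, and other values (2.3, or 3 for a "cubic" variant) only illustrate the kind of change discussed in this thread.

```python
def smooth_logits(logits, smoothing_factor, exponent=2.0):
    """Reshape logits around their maximum with a power-law penalty.

    Hypothetical helper for illustration only; not exllamav2's API.
    """
    if smoothing_factor <= 0:
        # Assumption: a non-positive factor means the transform is disabled.
        return list(logits)
    peak = max(logits)
    # The top logit is unchanged; every other logit is lowered by
    # smoothing_factor * (distance from the peak) ** exponent.
    return [peak - smoothing_factor * abs(x - peak) ** exponent for x in logits]


logits = [2.0, 1.0, 0.0]
print(smooth_logits(logits, 0.5))                # quadratic: [2.0, 1.5, 0.0]
print(smooth_logits(logits, 0.5, exponent=3.0))  # cubic:     [2.0, 1.5, -2.0]
```

With a factor below 1, logits close to the peak are raised relative to the original while distant ones are pushed down faster as the exponent grows, so a higher exponent cuts off the tail more sharply.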