markoarnauto opened 1 month ago
The idea is to test whether sampling strategies like majority-vote, tree-of-thought, graph-of-thought, r-star, and self-discover can be improved with special deployments (an LLM that is good at reasoning + a reward LLM).
@eva-jagodic in case you're thinking of:
adjust inference quality by sampling
By sampling (generating multiple responses with varying temperature) and then filtering (e.g. picking the majority answer), the quality of a model can be improved almost indefinitely. Even millions of samples can help. Since many sampling strategies run in parallel, this would be a neat extension.
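To make the sample-then-filter idea concrete, here is a minimal sketch of majority voting. The `sample_answer` function is a hypothetical stand-in for a real LLM call (it just simulates a model that answers correctly more often than not); in practice you would replace it with an API call at the given temperature. The independent samples could be generated in parallel.

```python
from collections import Counter
import random

def sample_answer(temperature: float) -> str:
    # Hypothetical stand-in for an LLM call: correct answer "42" with a
    # probability that drops as temperature rises; wrong answers are spread
    # across several alternatives.
    p_correct = max(0.05, 0.9 - 0.4 * temperature)
    if random.random() < p_correct:
        return "42"
    return random.choice(["41", "43", "7"])

def majority_vote(n_samples: int, temperature: float = 0.7) -> str:
    # Draw n_samples independent answers (embarrassingly parallel in a real
    # deployment) and return the most common one.
    answers = [sample_answer(temperature) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

random.seed(0)
print(majority_vote(101))
```

Even when a single sample is right only ~62% of the time, the majority over 101 samples is right almost always, since the wrong answers split their votes; this is the sense in which more samples buy more quality.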
The user could simply set the number of samples and thereby adjust the quality to their liking. The 'no-limits' feature especially makes dedicated inference well suited for this.
There are even more sophisticated sampling schemes like r-star from Microsoft, though not all of them can be parallelized. Some of them are able to beat gpt zero (although I forget where I saw this). @eva-jagodic interesting, don't you think?