Closed: kuvaus closed this pull request 1 year ago
Title: Add compatibility with new sampling algorithms in llama.cpp
Description: This pull request addresses issue https://github.com/nomic-ai/gpt4all-chat/issues/200#issue-1689677866 by adding compatibility with new sampling algorithms in llama.cpp.
Changes:
Implemented temperature sampling with repetition penalty as an alternative to the previous llama_sample_top_p_top_k sampling method.
```cpp
// Temperature sampling with repetition_penalty
llama_sample_repetition_penalty(
    d_ptr->ctx, &candidates_data,
    promptCtx.tokens.data() + promptCtx.n_ctx - promptCtx.repeat_last_n,
    promptCtx.repeat_last_n, promptCtx.repeat_penalty);
llama_sample_top_k(d_ptr->ctx, &candidates_data, promptCtx.top_k);
llama_sample_top_p(d_ptr->ctx, &candidates_data, promptCtx.top_p);
llama_sample_temperature(d_ptr->ctx, &candidates_data, promptCtx.temp);
llama_token id = llama_sample_token(d_ptr->ctx, &candidates_data);
```
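For context, llama.cpp's newer sampling API operates on an explicit `llama_token_data_array` built from the model's logits, rather than taking raw logits directly as the old `llama_sample_top_p_top_k` did. The excerpt above does not show how `candidates_data` is constructed, so the following is only a minimal sketch of the usual pattern from that era of the llama.cpp API; the variable names mirror the snippet above, but the construction itself is an assumption, not the PR's verbatim code:

```cpp
#include <vector>
#include "llama.h"

// Assumed setup (not in the PR diff): populate candidates_data from the
// current logits before running the sampling calls shown above.
const int n_vocab = llama_n_vocab(d_ptr->ctx);
float * logits   = llama_get_logits(d_ptr->ctx);

std::vector<llama_token_data> candidates;
candidates.reserve(n_vocab);
for (llama_token token_id = 0; token_id < n_vocab; token_id++) {
    // Each candidate carries the token id, its raw logit, and a probability
    // slot that the samplers fill in after softmax.
    candidates.emplace_back(llama_token_data{token_id, logits[token_id], 0.0f});
}
llama_token_data_array candidates_data = { candidates.data(), candidates.size(), false };
```

Each sampler call then filters or rescales this array in place, and `llama_sample_token` draws the final token from whatever candidates remain.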
I will look at this, but I will need to update the submodule at the same time, otherwise this will break. But this helps a ton! Thanks @kuvaus!