CyberShadow opened this issue 4 months ago (Open)
The usual name for this feature is "token healing". I agree that it would be nice to have it supported here.
@ggerganov I'd like to try working on it as my first issue!
Ok. This can be demonstrated in one of the examples. One way would be to add it to `main` or `simple` + extend `llama_sampling_sample` with the necessary functionality.
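To make the idea concrete, here is a hedged sketch of what the sampling-side constraint could look like. `llama_sampling_sample` itself is C++ inside llama.cpp; this is only an illustrative Python analogue, and the `mask_logits` helper, token strings, and logit values are all made up for the example:

```python
import math

def mask_logits(logits: dict[str, float], fragment: str) -> dict[str, float]:
    """Constrain sampling after token healing: once the trailing fragment
    has been trimmed from the prompt, set the logits of every token whose
    text does not start with that fragment to -inf, so softmax sampling
    can only pick tokens that "re-complete" the removed text."""
    return {
        tok: (score if tok.startswith(fragment) else -math.inf)
        for tok, score in logits.items()
    }

# Toy logits: without the constraint, ", Two" would dominate even though
# the prompt ends in the fragment "Thre".
masked = mask_logits({"Three": 1.2, ", Two": 3.5, ",": 0.1}, "Thre")
# Only "Three" keeps a finite logit; the other tokens are excluded.
```

The same effect could be achieved in llama.cpp by adding a logit-masking pass before the existing samplers run, so temperature/top-p behavior is preserved among the surviving candidates.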
Hi @ilyannn, do you still want to work on this? I've created a draft PR (#7028) that demonstrates token healing, but I still haven't added it to `main` or `server`. We can collaborate on that, if you'd like.
@mare5x Sorry, I have not actually started so please don't wait for me. I'll try to take a look at your PR this week though and will be happy to help in any way I can.
Feature Description
Hi! I am experimenting with using llama.cpp as a general-purpose code completion backend, similar to TabNine.
I am encountering a small problem: if the completion prompt ends mid-word, the results are not very accurate. For example, for a prompt such as `Five, Four, Thre` [sic], the model will often ignore the typo and suggest `, Two` (forming `Thre, Two`).

I think, as an option to the `/completion` server API, the following optional behavior would be useful:

Thanks!
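For illustration, the prompt-side half of token healing can be sketched with a toy word-level vocabulary: trim the trailing fragment that is a strict prefix of some token, then let generation re-complete it. The vocabulary and the `heal_prompt` helper below are hypothetical, not part of llama.cpp's API:

```python
# Toy vocabulary standing in for a real tokenizer's token texts.
VOCAB = ["Five", "Four", "Three", "Two", "One", ",", " "]

def heal_prompt(prompt: str) -> tuple[str, list[str]]:
    """Trim a trailing partial token and return (trimmed_prompt, candidates).

    Generation should then be constrained to tokens whose text starts with
    the removed fragment, so a prompt ending in "Thre" can complete to
    "Three" instead of being treated as a finished word."""
    # Scan from the longest trailing fragment to the shortest; stop at the
    # first fragment that is a strict prefix of at least one vocab token.
    for cut in range(len(prompt)):
        fragment = prompt[cut:]
        candidates = [t for t in VOCAB if t.startswith(fragment) and t != fragment]
        if candidates:
            return prompt[:cut], candidates
    return prompt, VOCAB  # nothing to heal

trimmed, allowed = heal_prompt("Five, Four, Thre")
# trimmed keeps everything up to the fragment; allowed lists the
# tokens the sampler may choose from next.
```

A real implementation would work on token IDs rather than strings and would need to handle fragments spanning multiple tokens, but the shape of the technique is the same.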