o0oradaro0o closed this issue 1 year ago
It could be this, but I forgot why I removed max_tokens. It was causing issues somehow; I think the problem was that it was exceeding the context window.
I'm also planning to use tokenizers later to count the exact number of tokens: https://github.com/FarisHijazi/PrivateGitHubCopilot/blob/master/PrivateGitHubCopilot/middleware.py#L39
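For illustration, here is a minimal sketch of that token-counting idea using the Hugging Face `tokenizers` library: measure the prompt and clamp `max_tokens` so prompt plus completion cannot exceed the context window. The model name, context-window size, and function name are placeholders, not values from this repo:

```python
from tokenizers import Tokenizer

CONTEXT_WINDOW = 4096  # assumption: depends on the model loaded in ooba
tokenizer = Tokenizer.from_pretrained("gpt2")  # assumption: any HF tokenizer

def clamp_max_tokens(prompt: str, requested_max_tokens: int) -> int:
    """Return a max_tokens value that keeps prompt + completion in the window."""
    prompt_tokens = len(tokenizer.encode(prompt).ids)
    available = max(CONTEXT_WINDOW - prompt_tokens, 0)
    return min(requested_max_tokens, available)
```

Clamping per request like this would let the middleware keep a `max_tokens` field without risking the context-window overflow described above.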
Removing those lines solves the problem (at least for the models I'm using, which have very large context windows). Thank you!!
It could be an environment issue, but I've tried different model sizes and my suggestions are always ~16 tokens long (whereas if I don't override the API and use regular Copilot, I get much longer responses). If I paste my code into the ooba notebook I get longer responses with the same model, so it's something to do with how Copilot is passing the request to ooba.
In the sample GIF I can see you get a full-function response. Is there something I need to change in ooba, or another setting I need to override in the extension? Any ideas?