huggingface / llm-vscode

LLM powered development for VSCode
Apache License 2.0

Add SantaCoder model to templates #90

Open YuryYakhno opened 1 year ago

YuryYakhno commented 1 year ago

The SantaCoder model uses different (but very similar) special tokens compared to the StarCoder model. The current settings contain a template only for StarCoder, so it seems logical to simply change "bigcode/starcoder" to "bigcode/santacoder" in the "Model ID or Endpoint" setting. That is not enough, however: SantaCoder's FIM tokens start with "fim-", while StarCoder's start with "fim_". The difference is easy to miss when skimming the settings. With the wrong FIM tokens, SantaCoder does not work properly: the "fim_..." tokens are parsed as plain text, and the model occasionally emits them in its output.

This issue was discussed on SantaCoder's model page. To prevent it in the future without changing SantaCoder's interface, I propose adding a separate template for SantaCoder with the proper special tokens.
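
To make the difference concrete, here is a minimal sketch of per-model FIM templates. The `FIM_TEMPLATES` table, its field names, and `build_fim_prompt` are hypothetical and only illustrate the idea; they are not the extension's actual settings schema or code. The token spellings follow the two models' published FIM formats (hyphens for SantaCoder, underscores for StarCoder).

```python
# Hypothetical per-model template table (illustrative names, not the
# extension's real configuration keys).
FIM_TEMPLATES = {
    "bigcode/starcoder": {
        "prefix": "<fim_prefix>",
        "suffix": "<fim_suffix>",
        "middle": "<fim_middle>",
    },
    "bigcode/santacoder": {
        "prefix": "<fim-prefix>",
        "suffix": "<fim-suffix>",
        "middle": "<fim-middle>",
    },
}


def build_fim_prompt(model_id: str, before_cursor: str, after_cursor: str) -> str:
    """Assemble a fill-in-the-middle prompt using the tokens for model_id."""
    tokens = FIM_TEMPLATES[model_id]
    return (
        f"{tokens['prefix']}{before_cursor}"
        f"{tokens['suffix']}{after_cursor}"
        f"{tokens['middle']}"
    )


if __name__ == "__main__":
    before = "def add(a, b):\n    "
    after = "\n    return result\n"
    # Same code context, different special tokens per model.
    print(build_fim_prompt("bigcode/starcoder", before, after))
    print(build_fim_prompt("bigcode/santacoder", before, after))
```

With a table like this, switching the model ID would also switch the tokens, so the "fim_" spelling could not be sent to SantaCoder by accident.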