cntseesharp / L.AI

Guide for L.AI extension.

Wrong model in the readme? #2

Closed CoreyHayward closed 9 months ago

CoreyHayward commented 9 months ago

Your readme mentions using the instruct model of DeepSeek-Coder, but isn't that model specifically not trained on FIM, which you mention you are using?

cntseesharp commented 9 months ago

According to the DeepSeek GitHub, their models are trained for code insertion. The difference between base and instruct is usually that instruct is slightly better at following instructions, and I find that to be true in cases where you put a comment describing your desired code and let the model complete the rest for you.

I used 6.7B-instruct for the screenshots and now use it daily for suggestions. This is only my recommendation; it is in no way mandatory to use instruct models with the extension. It just felt slightly better than base for me, and that's it.

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-6.7b-base", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
input_text = """<｜fim▁begin｜>def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    left = []
    right = []
<｜fim▁hole｜>
        if arr[i] < pivot:
            left.append(arr[i])
        else:
            right.append(arr[i])
    return quick_sort(left) + [pivot] + quick_sort(right)<｜fim▁end｜>"""
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True)[len(input_text):])
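
And the comment-driven case I mentioned works with the same FIM format. Here is a minimal sketch reusing the tokenizer and model loaded above (the function name and comment text are only illustrative):

# Describe the desired code in a comment and let the model fill in the loop body.
input_text = """<｜fim▁begin｜># Compute the factorial of n iteratively.
def factorial(n):
    result = 1
<｜fim▁hole｜>
    return result<｜fim▁end｜>"""
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True)[len(input_text):])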
CoreyHayward commented 9 months ago

Interesting. The code snippet copied from their GitHub uses the base model, and on Hugging Face the "How to Use" section only shows chat examples for the instruct version, while the base version includes FIM. If you don't mind sharing, what model options are you using to get the best results?

cntseesharp commented 9 months ago

As I mentioned, the instruct version just feels slightly better to me than the base version.

KoboldCpp is left at its default settings (except for context length): CuBLAS with mmq, and ContextShift enabled.

For inference I pass:

  "rep_pen": 1,
  "rep_pen_range": 256,
  "rep_pen_slope": 1,
  "temperature": 1,
  "tfs": 1,
  "top_a": 0,
  "top_k": 100,
  "top_p": 0.3,
  "typical": 1,

These produce quite accurate predictions, which are reproducible with any seed.
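
For completeness, here is a minimal sketch of sending those settings to KoboldCpp's KoboldAI-compatible /api/v1/generate endpoint; the local URL/port, prompt, and max_length are assumptions for illustration:

import requests

# Illustrative FIM prompt; in practice the extension builds this from your editor context.
prompt = "<｜fim▁begin｜>def add(a, b):\n<｜fim▁hole｜>\n<｜fim▁end｜>"

payload = {
    "prompt": prompt,
    "max_length": 128,      # tokens to generate (assumption)
    "rep_pen": 1,
    "rep_pen_range": 256,
    "rep_pen_slope": 1,
    "temperature": 1,
    "tfs": 1,
    "top_a": 0,
    "top_k": 100,
    "top_p": 0.3,
    "typical": 1,
}

# KoboldCpp's default port is 5001; adjust if you launched it differently.
response = requests.post("http://localhost:5001/api/v1/generate", json=payload)
print(response.json()["results"][0]["text"])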

CoreyHayward commented 9 months ago

Thanks for your help!