QwenLM / CodeQwen1.5

CodeQwen1.5 is the code version of Qwen, the large language model series developed by the Qwen team at Alibaba Cloud.

Too many TODO in the code completion scenario #11

Closed. ChuangLee closed this issue 2 months ago.

ChuangLee commented 2 months ago

In code completion, the model frequently emits TODO placeholder comments instead of actual implementations, which greatly reduces its helpfulness.

(screenshot attached)
cyente commented 2 months ago

Thank you for your feedback. We will conduct a further investigation into this issue and take the necessary steps to address it.

As a temporary workaround, you could add the offending keywords to bad_words_ids to forcibly prevent the model from generating those tokens. This may help alleviate the issue.

For example:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Model id and device are assumptions for a runnable sketch.
device = "cuda"  # or "cpu"
TOKENIZER = AutoTokenizer.from_pretrained("Qwen/CodeQwen1.5-7B")
MODEL = AutoModelForCausalLM.from_pretrained("Qwen/CodeQwen1.5-7B").to(device)
# Encode the banned strings without special tokens so generate() blocks exactly those ids.
bad_words = ["//", " //"]
bad_word_ids = [TOKENIZER.encode(bad_word, add_special_tokens=False) for bad_word in bad_words]
input_text = """public class TestForLLM {
    //Write a method to compare two files for equality, and then test it.
    public static boolean compareFiles(String file1, String file2) {"""
model_inputs = TOKENIZER([input_text], return_tensors="pt").to(device)
outputs = MODEL.generate(model_inputs.input_ids, max_new_tokens=512, do_sample=False, bad_words_ids=bad_word_ids)[0]
output_text = TOKENIZER.decode(outputs, skip_special_tokens=True)
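
If you only want to inspect the completion itself, a small follow-up (continuing the snippet above, not part of the original reply) is to slice off the prompt tokens before decoding:

# Decode only the newly generated tokens, dropping the prompt that generate() echoes back.
prompt_len = model_inputs.input_ids.shape[1]
completion = TOKENIZER.decode(outputs[prompt_len:], skip_special_tokens=True)
print(input_text + completion)

This makes it easier to check whether the bad_words_ids filter actually kept the // TODO placeholders out of the generated code.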