THUDM / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

build_inputs_with_special_tokens #277

Open fxb392 opened 1 year ago

fxb392 commented 1 year ago

Is there an existing issue for this?

Current Behavior

```python
def build_inputs_with_special_tokens(
    self, token_ids_0: List[int], token_ids_1: Optional[List[int]] = None
) -> List[int]:
    prefix_tokens = self.get_prefix_tokens()
    token_ids_0 = prefix_tokens + token_ids_0
    if token_ids_1 is not None:
        token_ids_0 = token_ids_0 + token_ids_1 + [self.get_command("<eos>")]
    return token_ids_0
```

Why does this method return `[gMASK] <sop> sentence1 sentence2 <eos>`? Shouldn't it be `sentence1 [gMASK] <sop> sentence2 <eop>`?
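For anyone who wants to check the ordering directly, here is a minimal sketch, assuming the released THUDM/chatglm2-6b checkpoint is available (locally or from the Hub) and that loading it with `trust_remote_code` is acceptable; the example sentences and the exact token strings printed are illustrative only:

```python
from transformers import AutoTokenizer

# Assumption: the public THUDM/chatglm2-6b checkpoint (or a local copy of it).
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)

# Two toy segments, encoded without any special tokens.
ids_a = tokenizer.encode("你好", add_special_tokens=False)
ids_b = tokenizer.encode("晚上好", add_special_tokens=False)

# The method under discussion: it prepends the prefix tokens returned by
# get_prefix_tokens() and, since a second segment is passed, appends the end token.
full = tokenizer.build_inputs_with_special_tokens(ids_a, ids_b)
print(tokenizer.convert_ids_to_tokens(full))
# Per the code quoted above, the printout should start with the two prefix tokens
# ([gMASK] and sop), followed by sentence1, then sentence2, then the end token.
```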

Expected Behavior

No response

Steps To Reproduce

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Anything else?

No response

supdizh commented 1 year ago

Yes! I have the same question. Shouldn't the sequence start with "<bos>"? And shouldn't this "<eop>" be something the model outputs?

supdizh commented 1 year ago

Here's my guess, not sure whether it's right, would appreciate a correction from someone who knows better: putting gMASK at the start of the sentence also works, as long as training and inference are consistent. In multi-task fine-tuning with special tokens on other architectures, the special token is usually placed at the front as well, perhaps because the front position is "more stable"? And <sop> here is presumably just being used as <bos>.
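To make the two orderings being debated concrete, here is a toy sketch; the layout helpers are hypothetical, operate on token strings rather than real ids, and are not part of the ChatGLM2 code. It only illustrates the point above: either placement of [gMASK] can work, provided training and inference build inputs the same way.

```python
from typing import List

GMASK, SOP, EOS, EOP = "[gMASK]", "<sop>", "<eos>", "<eop>"

def chatglm2_layout(sent1: List[str], sent2: List[str]) -> List[str]:
    # Ordering produced by build_inputs_with_special_tokens in this repo:
    # prefix tokens first, then both segments, then <eos>.
    return [GMASK, SOP] + sent1 + sent2 + [EOS]

def infilling_layout(sent1: List[str], sent2: List[str]) -> List[str]:
    # Ordering the issue author expected, in the style of GLM blank infilling:
    # context first, then [gMASK] <sop>, then the span to generate, ending with <eop>.
    return sent1 + [GMASK, SOP] + sent2 + [EOP]

if __name__ == "__main__":
    s1, s2 = ["你", "好"], ["晚", "上", "好"]
    print(chatglm2_layout(s1, s2))   # ['[gMASK]', '<sop>', '你', '好', '晚', '上', '好', '<eos>']
    print(infilling_layout(s1, s2))  # ['你', '好', '[gMASK]', '<sop>', '晚', '上', '好', '<eop>']
```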