dzhengxin opened 5 months ago
Is there an existing issue for this?
Current Behavior
`chat` returns a normal response, but the token ids printed from `generate` contain only one token (id 5) beyond the input, which decodes to an empty string. Single-query inference via `chat` works; switching to `generate` for single-query or batched inference returns empty results in both cases.
```python
inputs = self._tokenizer(text_list, padding=True, return_tensors="pt")
inputs = inputs.to(self._model.device)
outputs = self._model.generate(**inputs, max_length=512, do_sample=False)
```
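One thing worth checking in the batched case (an assumption, not confirmed by the snippet): `transformers` decoder-only models generally need `tokenizer.padding_side = "left"` before batched `generate()`. With the default right padding, pad tokens sit between each prompt and the position where generation starts, which the model was never trained to continue from. A toy illustration of the token layout:

```python
# Toy illustration (not real tokenizer output): why right padding
# breaks batched generation for decoder-only models.
pad = 0
prompt = [5, 6, 7]

# Right padding: pad tokens separate the prompt from the position
# where generate() appends new tokens.
right_padded = prompt + [pad, pad]   # [5, 6, 7, 0, 0]

# Left padding: the prompt ends exactly where generation begins.
left_padded = [pad, pad] + prompt    # [0, 0, 5, 6, 7]

assert right_padded[-1] == pad   # generation would follow a pad token
assert left_padded[-1] != pad    # generation follows the real prompt
```

In practice this means setting `self._tokenizer.padding_side = "left"` before the `self._tokenizer(text_list, padding=True, ...)` call above.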
Both decoding approaches yield empty strings:

```python
llm_outputs = list()
for j, output in enumerate(outputs.tolist()):
    index = len(inputs["input_ids"][j])
    output1 = output[index:]
    response = self._tokenizer.decode(output1, skip_special_tokens=True)
    llm_outputs.append(response)

llm_outputs2 = self._tokenizer.batch_decode(outputs)
```
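A likely cause (assuming a ChatGLM-style model, since the model class isn't shown): `model.chat()` wraps the query in a conversation template before tokenizing, while `generate()` here receives the raw text. Without that wrapping the model often emits an end-of-sequence token immediately, which would match the single extra token id and the empty decode above. A minimal sketch of the ChatGLM2-style wrapping; the authoritative version is the tokenizer's own `build_prompt` (or `apply_chat_template` in newer `transformers`):

```python
def build_chatglm2_prompt(query, history=None):
    """Approximate the prompt wrapping that ChatGLM2's model.chat()
    applies before tokenization. Passing the raw query straight to
    generate() skips this wrapping, which can make the model emit an
    end-of-sequence token immediately (an empty decoded response)."""
    history = history or []
    prompt = ""
    # Each past (query, response) pair becomes one numbered round.
    for i, (old_query, response) in enumerate(history):
        prompt += "[Round {}]\n\n问：{}\n\n答：{}\n\n".format(i + 1, old_query, response)
    # The current query gets the final round, left open for the answer.
    prompt += "[Round {}]\n\n问：{}\n\n答：".format(len(history) + 1, query)
    return prompt
```

With this, tokenizing `[build_chatglm2_prompt(t) for t in text_list]` instead of `text_list` should reproduce what `chat` feeds the model.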