baichuan-inc / Baichuan2

A series of large language models developed by Baichuan Intelligent Technology
https://huggingface.co/baichuan-inc
Apache License 2.0

For certain inputs, baichuan2-7b-base produces empty output #322

Open guankaisi opened 6 months ago

guankaisi commented 6 months ago

In my experiments, when I set the input to

{'input_ids': tensor([[43707,  4007,  1833, 10329,  1836,  7293,  3799,    65, 22792,  2169,
          4007,  1833,  2218,  2079,    65, 92413, 11721,  7293,  2835,  1754,
            66,     5, 92830,  4007,  1833, 23779,     5, 92754, 92561, 92611,
          2019,  4913, 71113,  4594, 66293,  2019,    65,  2026,  8172, 93038,
          9585,  1880, 21458, 92385, 30881, 21635,    65, 22351, 46699,  2935,
            70,     5,  1541,  3432, 92415, 92754, 92561, 92611, 18194,  3698,
         40540,  2669, 13569,    65, 92754, 92561, 92611, 45715, 26288,    69,
         29029,    69, 61612, 92377,  7312, 49077,  9572, 43593,    65,  8668,
         87647, 10849,  4126,  3723, 13169,    66,  2195,    65,  3235,  1914,
         72030,  2983, 20430, 92754, 92561, 92611,  3088,    65,  3309,  5661,
            69, 93092, 92461, 93149,    69, 95555, 92681,    69, 93461, 93492,
            69, 58660,    69, 92775, 93475, 92457,    65, 92492, 93349, 92877,
         93848,    69, 12213,  3088, 65921,  3403,  2637, 45412,    66,  2001,
         21218,  5076, 46208, 92435, 49369, 92754, 92561, 92611,  2019,  5560,
            66,     5,  1541,  2190,  3723,  2814,    65, 92754, 92561, 92611,
          3088,  4594, 24552,  4040,  9418, 92333,  8367, 92385, 19692,    65,
         16095, 13203,  9418, 26288,  4897,  3799, 57644,    66, 92355,  5656,
          1827,    65,  3723,  1583, 15424, 14685,    65, 19796, 93014, 54692,
          3501, 13170,    89, 92355,  1836, 30304, 77052,    65,  3723, 15796,
         93014, 14685,    65,  3668, 26288,  9037, 92385,  9735, 16476,    66,
          2573,    65, 92754, 92561, 92611,  3088, 44696,  3087,    65, 19359,
         12272, 92385,  9329,  4561, 92435, 16554,    66,     5,  1541, 92754,
         92561, 92611,  3088,  4463, 48956, 14041, 27783, 92508, 92668, 93665,
         47428, 17206,  8195,  7636,  2149,    66, 89749,  8018,  2438, 44759,
            69,  2019, 92349,  6902,    69,  9735,  4086, 84077,  7125,    65,
         92754, 92561, 92611,  3088,  2401,  7615, 27783, 72288,  9418, 92333,
         85100,    65, 92649,  3723, 46959, 92527, 27550,  2137,    66,  3432,
         20531, 18194,  2849, 19462, 12732,    65, 92754, 92561, 92611,  3088,
         25284,  5475, 92593, 93665, 47428, 24489,    66,  1411, 25616,  2079,
         59645, 10747, 92401, 92754, 92561, 92611, 47428,  8519,  2401, 67138,
         64512,    65,  2360, 20531, 92333, 10426,  2137,    68, 92401, 89050,
            66,  7014, 25616,  2079, 59645, 24318, 92360, 92754, 92561, 92611,
         47428,  8519,  2401, 67138, 64512,    65,  2360, 20531, 92333, 10426,
          2137,    68, 92361, 89050,    66,  1869,     5, 92830,  1754, 23779,
             5, 92754, 92561, 92611, 47428,  8519,  2401, 67138, 64512,    65,
          2360, 20531, 92333, 10426,  2137,    68,     5,  2169,  6177,  4007,
          1833,    65,  2193,  7293, 92376,    70]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1]])}

and run the official inference code,

pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))

the model immediately replies with EOS, so the printed output is empty. What could be the cause? Environment: transformers==4.29.1
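One way to confirm the EOS-only behavior is to compare the generated ids against the prompt length. The helper below is a sketch (not code from this repo); `new_tokens` and the EOS id `2` are illustrative assumptions, the real id comes from the model config.

```python
# Diagnostic sketch: did generate() produce anything beyond the prompt?
# (hypothetical helper; in practice eos_token_id comes from model.config)

def new_tokens(generated_ids, prompt_len, eos_token_id):
    """Return the tokens generated after the prompt, excluding EOS tokens."""
    tail = generated_ids[prompt_len:]
    return [t for t in tail if t != eos_token_id]

# If the model emits EOS as its first new token, the continuation is empty:
prompt = [43707, 4007, 1833]
out = prompt + [2]  # assume EOS id is 2 and it is the only generated token
print(new_tokens(out, len(prompt), eos_token_id=2))  # -> []
```

An empty list here means the empty print is not a decoding problem but the model genuinely stopping at once.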

mmmans commented 6 months ago

Can you provide the corresponding sentence of the input?

guankaisi commented 6 months ago

Just decode the input_ids with the baichuan2 tokenizer. It's not only this one input: I've found that many inputs make baichuan2-7b-base reply with empty output, and I don't know what's going on.
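One subtlety worth noting: with `skip_special_tokens=True`, a reply consisting only of an EOS token decodes to an empty string, so the print alone cannot distinguish "empty reply" from "EOS-only reply". The toy sketch below illustrates this with a made-up two-entry vocabulary (the ids and `decode` helper are illustrative, not the real baichuan2 tokenizer):

```python
# Toy illustration of skip_special_tokens (hypothetical vocabulary).
VOCAB = {2: "</s>", 100: "你好"}

def decode(ids, skip_special_tokens=True):
    """Mimic tokenizer.decode: optionally drop special tokens like EOS."""
    special = {"</s>"}
    toks = [VOCAB[i] for i in ids]
    if skip_special_tokens:
        toks = [t for t in toks if t not in special]
    return "".join(toks)

print(repr(decode([2])))                              # -> ''
print(repr(decode([2], skip_special_tokens=False)))   # -> '</s>'
```

Decoding with `skip_special_tokens=False`, or inspecting the raw ids in `pred`, shows what the model actually returned.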

windygoo commented 2 months ago

any update?