CLUEbenchmark / FewCLUE

FewCLUE: a few-shot learning evaluation benchmark, Chinese version
https://arxiv.org/abs/2107.07498

FewCLUE LM-BFF can't run in prompt-demo mode #11

Closed: o0mahan0o closed this issue 3 years ago

o0mahan0o commented 3 years ago

error: TypeError: int() argument must be a string, a bytes-like object or a number, not 'list'

TASK=tnews TYPE=prompt-demo TASK_EXTRA="--max_seq_len 512 --first_sent_limit 500 --other_sent_limit 500"

[features.input_ids] in your code: [101, 711, 784, 720, 2434, 746, 4670, 686, 1348, 4917, 5273, 5961, 4670, 686, 8043, 2434, 746, 4670, 686, 3198, 3249, 6858, 4636, 1998, 4495, 3833, 3717, 2398, 1168, 2419, 2582, 3416, 8043, 1962, 103, 103, 8013, 102, 1045, 3471, 1762, 1278, 7368, 7305, 1366, 2939, 702, 1968, 8024, 1962, 1962, 2105, 7556, 8024, 679, 1168, 676, 2399, 8024, 2157, 7027, 3341, 749, 702, 6570, 2669, 2060, 1967, 1962, [3125, 752], 103, 8013, 102, 4507, 5464, 3153, 2682, 1168, 7942, 7962, 4331, 4638, 3125, 752, 928, 679, 928, 4507, 872, 1962, [3152, 1265], 103, 8013, 102, 823, 5799, 2025, 185, 5801, 823, 1046, 1469, 1395, 6612, 2209, 185, 6930, 6801, 8024, 4991, 1762, 3198, 2213, 722, 2330, 8024, 1316, 3295, 711, 4263, 856, 6822, 2212, 1812, 1962, [2031, 727], 103, 8013, 102, 6593, 3172, 3722, 185, 3683, 843, 1377, 6458, 3221, 671, 807, 2342, 3215, 8024, 794, 8108, 2259, 5273, 1168, 4385, 1762, 1962, [860, 5509], 103, 8013, 102, 2582, 720, 7309, 1166, 782, 955, 7178, 8043, 1962, [6568, 5307], 103, 8013, 102, 1068, 754, 743, 2791, 117, 872, 3187, 7557, 7309, 1166, 782, 117, 1372, 7444, 2462, 3926, 6821, 123, 702, 7309, 7579, 1962, [2791, 772], 103, 8013, 102, 7023, 4500, 7478, 2824, 6770, 2466, 6756, 6716, 8024, 124, 119, 10973, 11051, 116, 129, 8488, 6981, 4510, 2094, 6774, 1221, 5143, 5320, 8024, 5436, 2255, 6632, 2275, 1980, 1980, 4638, 1962, [3749, 6756], 103, 8013, 102, 1403, 3189, 5878, 1469, 1961, 4638, 3301, 1351, 812, 8024, 5018, 671, 6381, 1962, [3136, 5509], 103, 8013, 102, 1220, 6773, 4636, 674, 4493, 5635, 677, 1283, 674, 8024, 8450, 8118, 677, 6820, 3221, 679, 677, 8020, 122, 8021, 1962, [4906, 2825], 103, 8013, 102, 2802, 3647, 1305, 2799, 5838, 2400, 2843, 6624, 1071, 7942, 7032, 2797, 3366, 4638, 1894, 1070, 1963, 791, 678, 1767, 1963, 862, 8043, 1962, [1092, 752], 103, 8013, 102, 5865, 1399, 4638, 4684, 2357, 5384, 7351, 3862, 2284, 3295, 6901, 1525, 1126, 702, 1744, 2157, 4638, 751, 1932, 8043, 1962, [3180, 3952], 103, 8013, 102, 8128, 
2259, 4958, 1995, 1777, 7556, 7599, 6756, 6878, 2154, 8024, 6821, 816, 752, 5314, 2769, 812, 784, 720, 1423, 1355, 8043, 1962, [1744, 7354], 103, 8013, 102, 1920, 4669, 671, 4275, 5344, 5682, 8024, 2900, 3144, 3221, 1963, 862, 976, 1168, 1158, 3173, 7770, 4638, 8043, 1962, [5500, 4873], 103, 8013, 102, 1963, 862, 4692, 2521, 1093, 3333, 3136, 5509, 744, 3119, 6589, 7309, 7579, 8043, 1962, [1093, 689], 103, 8013, 102, 4385, 1762, 517, 4959, 6632, 4125, 5296, 518, 4638, 4868, 1690, 4696, 4638, 6820, 4294, 1166, 4638, 1914, 8024, 6443, 6820, 6381, 2533, 7213, 5682, 3324, 2797, 4638, 3198, 807, 1962, [4510, 4993], 103, 8013, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

[features.input_ids] in original LM-BFF's code: features.input_ids: [0, 118, 56, 7, 356, 409, 111, 42, 21, 9069, 11522, 479, 85, 21, 50264, 4, 2, 405, 128, 29, 101, 18443, 8, 8347, 605, 15673, 15, 2078, 2156, 53, 14, 128, 29, 5063, 2198, 36122, 4226, 2156, 3486, 473, 24, 2916, 5, 10603, 9, 5, 1569, 128, 29, 31083, 14186, 479, 85, 21, 6587, 4, 2, 605, 718, 876, 16, 10, 16599, 1971, 19, 215, 41, 20407, 14500, 154, 527, 14, 40, 5604, 5, 7283, 8, 7754, 9, 171, 479, 85, 21, 372, 4, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

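The problem seems to be that input_ids contains nested lists (for example [3125, 752]) where the model expects a flat sequence of integers, as in the original LM-BFF output. How can I fix the code? Thanks a lot!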
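A quick way to confirm this kind of diagnosis is to scan a feature for entries that are not plain integers. The sketch below is only an illustration: the helper name `find_non_int_entries` is hypothetical and not part of FewCLUE or LM-BFF, and the sample is a shortened excerpt shaped like the dump above.

```python
from typing import Any, List, Tuple


def find_non_int_entries(input_ids: List[Any]) -> List[Tuple[int, Any]]:
    """Return (position, value) for every entry that is not a plain int."""
    return [(i, v) for i, v in enumerate(input_ids) if not isinstance(v, int)]


# Shortened excerpt in the shape of the dump above: a flat id sequence
# with one nested pair where a single id is expected.
sample = [101, 711, 1962, [3125, 752], 103, 8013, 102]
bad = find_non_int_entries(sample)
print(bad)  # [(3, [3125, 752])]
```

An empty result means the sequence is flat and can be turned into a tensor; any hit points at the position where a list was inserted instead of a single id.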

o0mahan0o commented 3 years ago

../FewCLUE/baselines/models_pytorch/LM-BFF/src/dataset.py

change: self.label_to_word[key] = tokenizer.convert_tokens_to_ids(list(self.label_to_word[key]))

to: self.label_to_word[key] = tokenizer._convert_token_to_id(tokenizer.tokenize(' ' + self.label_to_word[key])[0])
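To see why this change removes the nested lists, here is a minimal sketch using a toy tokenizer. `ToyTokenizer` is a hypothetical stand-in that only imitates the three calls used in the two lines above, not the real Hugging Face object, and its tiny vocabulary (including which character maps to which id) is assumed for illustration.

```python
from typing import Dict, List, Union


class ToyTokenizer:
    """Hypothetical stand-in for a BERT-style Chinese tokenizer."""

    def __init__(self, vocab: Dict[str, int]) -> None:
        self.vocab = vocab
        self.unk_id = vocab["[UNK]"]

    def tokenize(self, text: str) -> List[str]:
        # Assumption: Chinese text is split into single characters
        # and whitespace is dropped.
        return [ch for ch in text if not ch.isspace()]

    def _convert_token_to_id(self, token: str) -> int:
        return self.vocab.get(token, self.unk_id)

    def convert_tokens_to_ids(
        self, tokens: Union[str, List[str]]
    ) -> Union[int, List[int]]:
        # A single str gives one int; a list of tokens gives a list of ints.
        if isinstance(tokens, str):
            return self._convert_token_to_id(tokens)
        return [self._convert_token_to_id(t) for t in tokens]


# Toy vocabulary; the character-to-id mapping is assumed for illustration.
tokenizer = ToyTokenizer({"[UNK]": 100, "故": 3125, "事": 752})
label_word = "故事"  # a two-character label word

# Original line: list("故事") is ["故", "事"], so the result is a list of two ids.
before = tokenizer.convert_tokens_to_ids(list(label_word))

# Replacement line: only the first token is kept, so the result is one int.
after = tokenizer._convert_token_to_id(tokenizer.tokenize(" " + label_word)[0])

print(before)  # [3125, 752]
print(after)   # 3125
```

One consequence worth keeping in mind: the replacement keeps only the first token of each label word, so two labels whose words start with the same character would end up with the same id.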