OpenBMB / ModelCenter

Efficient, Low-Resource, Distributed transformer implementation based on BMTrain
https://modelcenter.readthedocs.io
Apache License 2.0

[BUG] cpm1 finetuning error ---- AttributeError: 'BaseModelOutput' object has no attribute 'index_select' #37

Open pikaqqqqqq opened 1 year ago

pikaqqqqqq commented 1 year ago

Describe the bug

Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.642 seconds.
Prefix dict has been built successfully.
Loading model cost 0.646 seconds.
Prefix dict has been built successfully.
Traceback (most recent call last):
  File "/data2/lvyang/ModelCenter-main/examples/cpm1/finetune_cpm1.py", line 203, in <module>
    main()
  File "/data2/lvyang/ModelCenter-main/examples/cpm1/finetune_cpm1.py", line 200, in main
    finetune(args, tokenizer, model, optimizer, lr_scheduler, dataset, verbalizer)
  File "/data2/lvyang/ModelCenter-main/examples/cpm1/finetune_cpm1.py", line 130, in finetune
    logits = logits.index_select(dim=-1, index=verbalizer)
AttributeError: 'BaseModelOutput' object has no attribute 'index_select'

tensor([[[ 24.5469, -16.3125,  12.9609,  ..., -26.2656,   1.8779, -25.1406],
     [ 30.8906, -16.9375,   9.6719,  ..., -20.2969,   3.0117, -22.3125],
     [ 30.8281, -21.5781,  11.2500,  ..., -23.3750,   1.5078, -22.7812],
     ...,
     [ 10.5469,  25.8438,   4.4414,  ..., -10.7578,  -8.1484,  11.7344],
     [ 10.4688,  25.9062,   4.3164,  ..., -10.6719,  -8.1875,  11.6875],
     [ 10.3828,  25.9844,   4.1797,  ..., -10.5625,  -8.2188,  11.6562]],

    [[ 24.5469, -16.3125,  12.9609,  ..., -26.2656,   1.8779, -25.1406],
     [ 30.8906, -16.9375,   9.6719,  ..., -20.2969,   3.0117, -22.3125],
     [ 28.3906, -17.7188,  10.4922,  ..., -23.5625,   2.5723, -22.5000],
     ...,
     [ 11.8125,  29.6406,  -1.5371,  ..., -16.7656, -10.9219,  -5.0391],
     [ 11.6641,  29.7031,  -1.7520,  ..., -16.8594, -10.9609,  -4.9297],
     [ 11.5859,  29.6875,  -1.8467,  ..., -16.8750, -11.0156,  -4.8672]],

    [[ 24.5469, -16.3125,  12.9609,  ..., -26.2656,   1.8779, -25.1406],
     [ 30.8906, -16.9375,   9.6719,  ..., -20.2969,   3.0117, -22.3125],
     [ 24.0312, -19.5781,  10.6172,  ..., -23.1562,   3.0566, -26.5938],
     ...,
     [ 18.1875,  20.6250,  -0.4412,  ..., -20.0469,  -8.6406,  -5.4141],
     [ 18.0156,  20.6406,  -0.5356,  ..., -20.0000,  -8.6562,  -5.3984],
     [ 17.7969,  20.5781,  -0.6631,  ..., -19.9375,  -8.6797,  -5.3828]],

    ...,

    [[ 24.5469, -16.3125,  12.9609,  ..., -26.2656,   1.8779, -25.1406],
     [ 30.8906, -16.9375,   9.6719,  ..., -20.2969,   3.0117, -22.3125],
     [ 22.8906, -17.2656,   9.8203,  ..., -28.9688,   2.4785, -20.5781],
     ...,
     [  3.3633,  15.7969,  -7.3594,  ...,  -3.3105,  -5.4492,  12.5391],
     [  3.3730,  15.7188,  -7.5000,  ...,  -3.2207,  -5.5156,  12.4766],
     [  3.5176,  15.6875,  -7.6172,  ...,  -3.1367,  -5.6133,  12.3281]],

    [[ 24.5469, -16.3125,  12.9609,  ..., -26.2656,   1.8779, -25.1406],
     [ 30.8906, -16.9375,   9.6719,  ..., -20.2969,   3.0117, -22.3125],
     [ 28.4531, -14.2266,  13.3984,  ..., -29.8906,   3.5000, -22.8594],
     ...,
     [ -5.2656,  17.5000,  -1.3281,  ..., -11.0391,  -8.8672,   6.9297],
     [ -5.2812,  17.5156,  -1.3369,  ..., -10.9141,  -8.8828,   6.9883],
     [ -5.1719,  17.5625,  -1.3506,  ..., -10.8281,  -8.8750,   7.1328]],

    [[ 24.5469, -16.3125,  12.9609,  ..., -26.2656,   1.8779, -25.1406],
     [ 30.8906, -16.9375,   9.6719,  ..., -20.2969,   3.0117, -22.3125],
     [ 22.5625, -18.6562,  13.9297,  ..., -26.3594,   3.4219, -16.9531],
     ...,
     [ 26.6719, -16.6719,  13.6797,  ..., -24.8906,   0.3289, -22.3125],
     [ 26.6875, -16.5938,  13.6641,  ..., -24.9062,   0.3262, -22.2031],
     [ 26.6875, -16.5000,  13.6719,  ..., -24.8906,   0.3208, -22.1719]]],
   device='cuda:0', dtype=torch.float16, grad_fn=<MulBackward0>)

<class 'model_center.model.basemodel.BaseModelOutput'>
Traceback (most recent call last):
  File "/data2/lvyang/ModelCenter-main/examples/cpm1/finetune_cpm1.py", line 203, in <module>
    main()
  File "/data2/lvyang/ModelCenter-main/examples/cpm1/finetune_cpm1.py", line 200, in main
    finetune(args, tokenizer, model, optimizer, lr_scheduler, dataset, verbalizer)
  File "/data2/lvyang/ModelCenter-main/examples/cpm1/finetune_cpm1.py", line 130, in finetune
    logits = logits.index_select(dim=-1, index=verbalizer)
AttributeError: 'BaseModelOutput' object has no attribute 'index_select'

Minimal steps to reproduce

Expected behavior

Screenshots

Environment:

torch 1.10.2+cu111
model_center 0.1.5
bmtrain 0.1.8

Finetuning script: ModelCenter-main/examples/cpm1/finetun_cpm1.sh
Code: ModelCenter-main/examples/cpm1/finetune_cpm1.py
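
The traceback indicates that the model call returned a `BaseModelOutput` wrapper rather than a tensor, so `index_select` is being called on the wrapper. One possible workaround is to unwrap the logits before line 130 of `finetune_cpm1.py`. The following is only a sketch under assumptions: the field name `logits`, the helper `unwrap_logits`, and the stand-in class `FakeModelOutput` are all hypothetical and not taken from model_center, so the actual attribute name should be checked against the installed version.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeModelOutput:
    """Hypothetical stand-in for model_center's BaseModelOutput (illustration only)."""
    logits: Any = None


def unwrap_logits(output: Any, field: str = "logits") -> Any:
    """Return the raw logits whether `output` is a wrapper object or already a tensor.

    `field` is an assumed attribute name; adjust it to match the real output class.
    """
    # A tensor-like object already supports index_select, so pass it through.
    if hasattr(output, "index_select"):
        return output
    # Otherwise try to read the assumed field from the wrapper.
    value = getattr(output, field, None)
    if value is None:
        raise AttributeError(
            f"{type(output).__name__} has no usable '{field}' field"
        )
    return value


# Intended use inside finetune(), assuming `model(...)` returns the wrapper:
#   logits = unwrap_logits(model(...))
#   logits = logits.index_select(dim=-1, index=verbalizer)
```

With this in place, the `index_select` call would operate on the tensor in either case, and a wrapper without the expected field fails with an explicit message instead of the original error.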