microsoft / CodeBERT

MIT License

Question about code generation #250

Open Yangget opened 1 year ago

Yangget commented 1 year ago

Hello, author.

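I used your script and tutorial to fine-tune microsoft/unixcoder-base on the Concode dataset, but I get the same error every time: training runs normally, then the run crashes during evaluation after the first epoch. Do you know how to solve it?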

# python run.py \
    --do_train \
    --do_eval \
    --model_name_or_path microsoft/unixcoder-base \
    --train_filename ../dataset/concode/train.json \
    --dev_filename ../dataset/concode/dev.json \
    --output_dir saved_models \
    --max_source_length 350 \
    --max_target_length 150 \
    --beam_size 3 \
    --train_batch_size 16 \
    --eval_batch_size 16 \
    --learning_rate 5e-5 \
    --gradient_accumulation_steps 1 \
    --num_train_epochs 30
04/15/2023 04:34:22 - INFO - __main__ -   device: cuda, n_gpu: 4
04/15/2023 04:34:31 - INFO - __main__ -   Training/evaluation parameters Namespace(adam_epsilon=1e-08, beam_size=3, dev_filename='../dataset/concode/dev.json', device=device(type='cuda'), do_eval=True, do_test=False, do_train=True, eval_batch_size=16, gradient_accumulation_steps=1, learning_rate=5e-05, max_grad_norm=1.0, max_source_length=350, max_target_length=150, model_name_or_path='microsoft/unixcoder-base', n_gpu=4, no_cuda=False, num_train_epochs=30, output_dir='saved_models', seed=42, test_filename=None, train_batch_size=16, train_filename='../dataset/concode/train.json', weight_decay=0.0)
04/15/2023 04:34:36 - INFO - __main__ -   *** Example ***
04/15/2023 04:34:36 - INFO - __main__ -   idx: 0
04/15/2023 04:34:36 - INFO - __main__ -   source_tokens: ['<s>', '<encoder-decoder>', '</s>', 'check', '_if', '_details', '_are', '_parsed', '_.', '_con', 'code', '_', 'field', '_', 'sep', '_Container', '_parent', '_con', 'code', '_', 'elem', '_', 'sep', '_boolean', '_is', 'Parsed', '_con', 'code', '_', 'elem', '_', 'sep', '_long', '_offset', '_con', 'code', '_', 'elem', '_', 'sep', '_long', '_content', 'StartPosition', '_con', 'code', '_', 'elem', '_', 'sep', '_ByteBuffer', '_dead', 'Bytes', '_con', 'code', '_', 'elem', '_', 'sep', '_boolean', '_is', 'Read', '_con', 'code', '_', 'elem', '_', 'sep', '_long', '_mem', 'Map', 'Size', '_con', 'code', '_', 'elem', '_', 'sep', '_Logger', '_LOG', '_con', 'code', '_', 'elem', '_', 'sep', '_byte', '[]', '_user', 'Type', '_con', 'code', '_', 'elem', '_', 'sep', '_String', '_type', '_con', 'code', '_', 'elem', '_', 'sep', '_ByteBuffer', '_content', '_con', 'code', '_', 'elem', '_', 'sep', '_File', 'Channel', '_file', 'Channel', '_con', 'code', '_', 'field', '_', 'sep', '_Container', '_getParent', '_con', 'code', '_', 'elem', '_', 'sep', '_byte', '[]', '_getUser', 'Type', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_read', 'Content', '_con', 'code', '_', 'elem', '_', 'sep', '_long', '_get', 'Offset', '_con', 'code', '_', 'elem', '_', 'sep', '_long', '_getContent', 'Size', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_getContent', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_set', 'Dead', 'Bytes', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_parse', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_get', 'Header', '_con', 'code', '_', 'elem', '_', 'sep', '_long', '_getSize', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_parse', 'Details', '_con', 'code', '_', 'elem', '_', 'sep', '_String', '_getType', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '__', 'parse', 'Details', '_con', 'code', '_', 'elem', '_', 'sep', '_String', '_getPath', '_con', 'code', '_', 'elem', '_', 'sep', '_boolean', 
'_verify', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_set', 'Parent', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_get', 'Box', '_con', 'code', '_', 'elem', '_', 'sep', '_boolean', '_is', 'Small', 'Box', '<mask0>', '</s>']
04/15/2023 04:34:36 - INFO - __main__ -   source_ids: 0 5 2 1471 462 6768 1147 5654 746 549 780 181 1372 181 7421 10976 2201 549 780 181 4641 181 7421 2116 555 13885 549 780 181 4641 181 7421 1534 1805 549 780 181 4641 181 7421 1534 2264 31713 549 780 181 4641 181 7421 16980 10168 2240 549 780 181 4641 181 7421 2116 555 1616 549 780 181 4641 181 7421 1534 1835 1281 939 549 780 181 4641 181 7421 10641 5610 549 780 181 4641 181 7421 2134 1039 1695 641 549 780 181 4641 181 7421 1167 889 549 780 181 4641 181 7421 16980 2264 549 780 181 4641 181 7421 2536 3267 1012 3267 549 780 181 1372 181 7421 10976 19354 549 780 181 4641 181 7421 2134 1039 25533 641 549 780 181 4641 181 7421 723 1557 1646 549 780 181 4641 181 7421 1534 744 1884 549 780 181 4641 181 7421 1534 31482 939 549 780 181 4641 181 7421 723 31482 549 780 181 4641 181 7421 723 827 10099 2240 549 780 181 4641 181 7421 723 2467 549 780 181 4641 181 7421 723 744 1764 549 780 181 4641 181 7421 1534 34727 549 780 181 4641 181 7421 723 2467 5173 549 780 181 4641 181 7421 1167 15866 549 780 181 4641 181 7421 723 623 1783 5173 549 780 181 4641 181 7421 1167 33237 549 780 181 4641 181 7421 2116 5864 549 780 181 4641 181 7421 723 827 2645 549 780 181 4641 181 7421 723 744 1903 549 780 181 4641 181 7421 2116 555 8088 1903 19 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
04/15/2023 04:34:36 - INFO - __main__ -   target_tokens: ['<mask0>', 'boolean', '_function', '_(', '_)', '_{', '_return', '_is', 'Parsed', '_;', '_}', '</s>']
04/15/2023 04:34:36 - INFO - __main__ -   target_ids: 19 3763 603 400 743 399 483 555 13885 2476 425 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
04/15/2023 04:34:36 - INFO - __main__ -   *** Example ***
04/15/2023 04:34:36 - INFO - __main__ -   idx: 1
04/15/2023 04:34:36 - INFO - __main__ -   source_tokens: ['<s>', '<encoder-decoder>', '</s>', 'answer', '_the', '_library', '_file', '_defining', '_the', '_library', '_containing', '_the', '_compilation', '_unit', '_to', '_be', '_indexed', '_or', '_null', '_if', '_the', '_library', '_is', '_not', '_on', '_disk', '_con', 'code', '_', 'field', '_', 'sep', '_Index', 'Store', '_index', 'Store', '_con', 'code', '_', 'elem', '_', 'sep', '_Index', 'Performance', 'Recorder', '_performance', 'Recorder', '_con', 'code', '_', 'elem', '_', 'sep', '_D', 'art', 'Unit', '_unit', '_con', 'code', '_', 'elem', '_', 'sep', '_Compilation', 'Unit', '_compilation', 'Unit', '_con', 'code', '_', 'elem', '_', 'sep', '_Resource', '_resource', '_con', 'code', '_', 'elem', '_', 'sep', '_File', '_library', 'File', '_con', 'code', '_', 'field', '_', 'sep', '_boolean', '_remove', 'When', 'Resource', 'Removed', '_con', 'code', '_', 'elem', '_', 'sep', '_Compilation', 'Unit', '_get', 'CompilationUnit', '_con', 'code', '_', 'elem', '_', 'sep', '_boolean', '_is', 'Query', '_con', 'code', '_', 'elem', '_', 'sep', '_String', '_toString', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_perform', 'Operation', '<mask0>', '</s>']
04/15/2023 04:34:36 - INFO - __main__ -   source_ids: 0 5 2 7856 448 7971 1012 25118 448 7971 4275 448 16703 5108 508 661 15907 872 700 462 448 7971 555 800 854 8236 549 780 181 1372 181 7421 6257 2767 1442 2767 549 780 181 4641 181 7421 6257 18643 17391 12217 17391 549 780 181 4641 181 7421 614 605 2762 5108 549 780 181 4641 181 7421 43024 2762 16703 2762 549 780 181 4641 181 7421 7606 2377 549 780 181 4641 181 7421 2536 7971 956 549 780 181 1372 181 7421 2116 3033 7422 1755 12070 549 780 181 4641 181 7421 43024 2762 744 36587 549 780 181 4641 181 7421 2116 555 1538 549 780 181 4641 181 7421 1167 14696 549 780 181 4641 181 7421 723 4729 2783 19 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
04/15/2023 04:34:36 - INFO - __main__ -   target_tokens: ['<mask0>', 'File', '_function', '_(', '_)', '_{', '_return', '_library', 'File', '_;', '_}', '</s>']
04/15/2023 04:34:36 - INFO - __main__ -   target_ids: 19 956 603 400 743 399 483 7971 956 2476 425 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
04/15/2023 04:34:36 - INFO - __main__ -   *** Example ***
04/15/2023 04:34:36 - INFO - __main__ -   idx: 2
04/15/2023 04:34:36 - INFO - __main__ -   source_tokens: ['<s>', '<encoder-decoder>', '</s>', 'this', '_method', '_deletes', '_index', '_files', '_of', '_the', '_@', 'link', 'plain', '_index', 'commit', '_for', '_the', '_specified', '_generation', '_number', '_.', '_con', 'code', '_', 'field', '_', 'sep', '_Logger', '_log', '_con', 'code', '_', 'field', '_', 'sep', '_void', '_delete', 'Non', 'Snapshot', 'Index', 'Files', '_con', 'code', '_', 'elem', '_', 'sep', '_Map', '<', 'String', ',', 'Integer', '>', '_build', 'Ref', 'Counts', '<mask0>', '</s>']
04/15/2023 04:34:36 - INFO - __main__ -   source_ids: 0 5 2 490 1454 19028 1442 2966 595 448 890 1378 7687 1442 6140 563 448 2314 13490 1635 746 549 780 181 1372 181 7421 10641 1592 549 780 181 1372 181 7421 723 2821 3579 7597 1052 3765 549 780 181 4641 181 7421 4595 146 684 130 3215 148 3300 1725 13656 19 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
04/15/2023 04:34:36 - INFO - __main__ -   target_tokens: ['<mask0>', 'void', '_function', '_(', '_Directory', '_arg', '0', '_,', '_Collection', '_<', '_Snapshot', 'MetaData', '_>', '_arg', '1', '_,', '_long', '_arg', '2', '_)', '_{', '_List', '_<', '_Index', 'Commit', '_>', '_loc', '0', '_=', '_Directory', 'Reader', '_.', '_list', 'Comm', 'its', '_(', '_arg', '0', '_)', '_;', '_Map', '_<', '_String', '_,', '_Integer', '_>', '_loc', '1', '_=', '_build', 'Ref', 'Counts', '_(', '_arg', '1', '_,', '_loc', '0', '_)', '_;', '_for', '_(', '_Index', 'Commit', '_loc', '2', '_:', '_loc', '0', '_)', '_{', '_if', '_(', '_loc', '2', '_.', '_get', 'Generation', '_(', '_)', '_==', '_arg', '2', '_)', '_{', '_delete', 'Index', 'Files', '_(', '_arg', '0', '_,', '_loc', '1', '_,', '_loc', '2', '_)', '_;', '_break', '_;', '_}', '_}', '_}', '</s>']
04/15/2023 04:34:36 - INFO - __main__ -   target_ids: 19 895 603 400 11227 1238 134 2019 7079 517 29055 12247 711 1238 135 2019 1534 1238 136 743 399 2068 517 6257 8455 711 4893 134 385 11227 2692 746 1182 10268 1204 400 1238 134 743 2476 4595 517 1167 2019 5325 711 4893 135 385 3300 1725 13656 400 1238 135 2019 4893 134 743 2476 563 400 6257 8455 4893 136 545 4893 134 743 399 462 400 4893 136 746 744 13446 400 743 550 1238 136 743 399 2821 1052 3765 400 1238 134 2019 4893 135 2019 4893 136 743 2476 1127 2476 425 425 425 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
04/15/2023 04:34:36 - INFO - __main__ -   *** Example ***
04/15/2023 04:34:36 - INFO - __main__ -   idx: 3
04/15/2023 04:34:36 - INFO - __main__ -   source_tokens: ['<s>', '<encoder-decoder>', '</s>', 'do', '_n', "'t", '_use', '_this', '_.', '_no', '_,', '_really', '_,', '_do', '_n', "'t", '_use', '_this', '_.', '_you', '_already', '_have', '_an', '_authentication', 'token', '_with', '_org', '.', 'apache', '.', 'accum', 'ulo', '.', 'core', '.', 'client', '.', 'map', 'reduce', '.', 'lib', '.', 'impl', '.', 'configurator', 'base', '_#', 'get', 'authentication', 'token', '_class', '_,', '_configuration', '_.', '_you', '_do', '_n', "'t", '_need', '_to', '_construct', '_it', '_your', 'self', '_.', '_gets', '_the', '_password', '_from', '_the', '_configuration', '_.', '_warning', '_:', '_the', '_password', '_is', '_stored', '_in', '_the', '_configuration', '_and', '_shared', '_with', '_all', '_map', 'reduce', '_tasks', '_;', '_it', '_is', '_base', '64', '_encoded', '_to', '_provide', '_a', '_charset', '_safe', '_conversion', '_to', '_a', '_string', '_,', '_and', '_is', '_not', '_intended', '_to', '_be', '_secure', '_.', '_con', 'code', '_', 'field', '_', 'sep', '_Place', 'Holder', '_place', 'Holder', '_con', 'code', '_', 'field', '_', 'sep', '_String', '_get', 'Principal', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_set', 'LogLevel', '_con', 'code', '_', 'elem', '_', 'sep', '_Level', '_get', 'LogLevel', '_con', 'code', '_', 'elem', '_', 'sep', '_Boolean', '_is', 'Connector', 'Info', 'Set', '_con', 'code', '_', 'elem', '_', 'sep', '_String', '_getToken', 'Class', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_set', 'Z', 'oo', 'Keeper', 'Instance', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_set', 'Mock', 'Instance', '_con', 'code', '_', 'elem', '_', 'sep', '_Instance', '_getInstance', '_con', 'code', '_', 'elem', '_', 'sep', '_String', '_enum', 'To', 'Conf', 'Key', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_set', 'Connector', 'Info', '<mask0>', '</s>']
04/15/2023 04:34:36 - INFO - __main__ -   source_ids: 0 5 2 1440 416 1340 1393 547 746 1375 2019 9339 2019 1000 416 1340 1393 547 746 2713 2916 1577 817 11367 1363 918 6584 132 10521 132 14938 18976 132 1853 132 1590 132 1051 10137 132 1930 132 5218 132 49243 1118 830 459 16570 1363 1503 2019 4045 746 2713 1000 416 1340 2105 508 8669 835 4862 1733 746 8180 448 5724 1029 448 4045 746 5893 545 448 5724 555 6788 488 448 4045 706 6880 918 1345 1910 10137 10143 2476 835 555 1712 848 7040 508 6634 434 8862 7248 7661 508 434 724 2019 706 555 800 17482 508 661 17880 746 549 780 181 1372 181 7421 15454 8162 5002 8162 549 780 181 1372 181 7421 1167 744 13764 549 780 181 4641 181 7421 723 827 17614 549 780 181 4641 181 7421 11690 744 17614 549 780 181 4641 181 7421 6845 555 12479 986 815 549 780 181 4641 181 7421 1167 39081 1128 549 780 181 4641 181 7421 723 827 176 2799 26456 1562 549 780 181 4641 181 7421 723 827 3002 1562 549 780 181 4641 181 7421 10631 17183 549 780 181 4641 181 7421 1167 4020 687 4579 927 549 780 181 4641 181 7421 723 827 12479 986 19 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
04/15/2023 04:34:36 - INFO - __main__ -   target_tokens: ['<mask0>', 'byte', '_[', '_]', '_function', '_(', '_Class', '_<', '_?', '_>', '_arg', '0', '_,', '_Configuration', '_arg', '1', '_)', '_{', '_return', '_Authentication', 'Token', 'Serializer', '_.', '_serialize', '_(', '_org', '_.', '_a', 'pache', '_.', '_accum', 'ulo', '_.', '_core', '_.', '_client', '_.', '_map', 'reduce', '_.', '_lib', '_.', '_impl', '_.', '_Config', 'urator', 'Base', '_.', '_get', 'Authentication', 'Token', '_(', '_arg', '0', '_,', '_arg', '1', '_)', '_)', '_;', '_}', '</s>']
04/15/2023 04:34:36 - INFO - __main__ -   target_ids: 19 1106 626 2406 603 400 4807 517 999 711 1238 134 2019 9337 1238 135 743 399 483 19810 1367 7570 746 11160 400 6584 746 434 23566 746 12027 18976 746 6152 746 2234 746 1910 10137 746 3295 746 12606 746 4555 19726 1737 746 744 9832 1367 400 1238 134 2019 1238 135 743 743 2476 425 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
04/15/2023 04:34:36 - INFO - __main__ -   *** Example ***
04/15/2023 04:34:36 - INFO - __main__ -   idx: 4
04/15/2023 04:34:36 - INFO - __main__ -   source_tokens: ['<s>', '<encoder-decoder>', '</s>', 'force', '_the', '_event', 'bus', '_from', '_am', 'bar', 'i', 'event', 'publisher', '_to', '_be', '_serial', 'and', '_synchronous', '_.', '_con', 'code', '_', 'field', '_', 'sep', '_Place', 'Holder', '_place', 'Holder', '_con', 'code', '_', 'field', '_', 'sep', '_void', '_register', 'Alert', 'Listeners', '_con', 'code', '_', 'elem', '_', 'sep', '_Event', 'Bus', '_synchronize', 'Alert', 'Event', 'Publisher', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_replace', 'Event', 'Bus', '_con', 'code', '_', 'elem', '_', 'sep', '_void', '_register', 'Amb', 'ari', 'Listeners', '<mask0>', '</s>']
04/15/2023 04:34:36 - INFO - __main__ -   source_ids: 0 5 2 5104 448 1488 3083 1029 4000 1829 191 1357 26072 508 661 9706 501 18791 746 549 780 181 1372 181 7421 15454 8162 5002 8162 549 780 181 1372 181 7421 723 2882 9203 8396 549 780 181 4641 181 7421 3916 7663 35717 9203 1089 17883 549 780 181 4641 181 7421 723 4126 1089 7663 549 780 181 4641 181 7421 723 2882 19443 11215 8396 19 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
04/15/2023 04:34:36 - INFO - __main__ -   target_tokens: ['<mask0>', 'void', '_function', '_(', '_Binder', '_arg', '0', '_)', '_{', '_Event', 'Bus', '_loc', '0', '_=', '_new', '_Event', 'Bus', '_(', '_)', '_;', '_A', 'mb', 'ari', 'Event', 'Publisher', '_loc', '1', '_=', '_new', '_A', 'mb', 'ari', 'Event', 'Publisher', '_(', '_)', '_;', '_replace', 'Event', 'Bus', '_(', '_A', 'mb', 'ari', 'Event', 'Publisher', '_.', '_class', '_,', '_loc', '1', '_,', '_loc', '0', '_)', '_;', '_arg', '0', '_.', '_bind', '_(', '_A', 'mb', 'ari', 'Event', 'Publisher', '_.', '_class', '_)', '_.', '_to', 'Instance', '_(', '_loc', '1', '_)', '_;', '_}', '</s>']
04/15/2023 04:34:36 - INFO - __main__ -   target_ids: 19 895 603 400 43863 1238 134 743 399 3916 7663 4893 134 385 579 3916 7663 400 743 2476 553 1228 11215 1089 17883 4893 135 385 579 553 1228 11215 1089 17883 400 743 2476 4126 1089 7663 400 553 1228 11215 1089 17883 746 1503 2019 4893 135 2019 4893 134 743 2476 1238 134 746 6679 400 553 1228 11215 1089 17883 746 1503 743 746 508 1562 400 4893 135 743 2476 425 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
/opt/conda/lib/python3.7/site-packages/transformers/optimization.py:395: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  FutureWarning,
04/15/2023 04:37:22 - INFO - __main__ -   ***** Running training *****
04/15/2023 04:37:22 - INFO - __main__ -     Num examples = 100000
04/15/2023 04:37:22 - INFO - __main__ -     Batch size = 16
04/15/2023 04:37:22 - INFO - __main__ -     Num epoch = 30
/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
  warnings.warn('Was asked to gather along dimension 0, but all '
04/15/2023 04:38:28 - INFO - __main__ -   epoch 0 step 100 loss 3.0704
04/15/2023 04:39:24 - INFO - __main__ -   epoch 0 step 200 loss 2.4808
04/15/2023 04:40:21 - INFO - __main__ -   epoch 0 step 300 loss 1.846
04/15/2023 04:41:17 - INFO - __main__ -   epoch 0 step 400 loss 1.4038
04/15/2023 04:42:13 - INFO - __main__ -   epoch 0 step 500 loss 1.1994
04/15/2023 04:43:10 - INFO - __main__ -   epoch 0 step 600 loss 1.0601
04/15/2023 04:44:06 - INFO - __main__ -   epoch 0 step 700 loss 0.9902
04/15/2023 04:45:02 - INFO - __main__ -   epoch 0 step 800 loss 0.9025
04/15/2023 04:45:59 - INFO - __main__ -   epoch 0 step 900 loss 0.831
04/15/2023 04:46:55 - INFO - __main__ -   epoch 0 step 1000 loss 0.7655
04/15/2023 04:47:51 - INFO - __main__ -   epoch 0 step 1100 loss 0.7568
04/15/2023 04:48:48 - INFO - __main__ -   epoch 0 step 1200 loss 0.743
04/15/2023 04:49:44 - INFO - __main__ -   epoch 0 step 1300 loss 0.7144
04/15/2023 04:50:40 - INFO - __main__ -   epoch 0 step 1400 loss 0.7082
04/15/2023 04:51:37 - INFO - __main__ -   epoch 0 step 1500 loss 0.7097
04/15/2023 04:52:33 - INFO - __main__ -   epoch 0 step 1600 loss 0.7053
04/15/2023 04:53:29 - INFO - __main__ -   epoch 0 step 1700 loss 0.6702
04/15/2023 04:54:26 - INFO - __main__ -   epoch 0 step 1800 loss 0.6807
04/15/2023 04:55:22 - INFO - __main__ -   epoch 0 step 1900 loss 0.6497
04/15/2023 04:56:18 - INFO - __main__ -   epoch 0 step 2000 loss 0.6553
04/15/2023 04:57:15 - INFO - __main__ -   epoch 0 step 2100 loss 0.6442
04/15/2023 04:58:11 - INFO - __main__ -   epoch 0 step 2200 loss 0.6638
04/15/2023 04:59:07 - INFO - __main__ -   epoch 0 step 2300 loss 0.6162
04/15/2023 05:00:04 - INFO - __main__ -   epoch 0 step 2400 loss 0.6395
04/15/2023 05:01:00 - INFO - __main__ -   epoch 0 step 2500 loss 0.6131
04/15/2023 05:01:56 - INFO - __main__ -   epoch 0 step 2600 loss 0.6195
04/15/2023 05:02:53 - INFO - __main__ -   epoch 0 step 2700 loss 0.614
04/15/2023 05:03:49 - INFO - __main__ -   epoch 0 step 2800 loss 0.59
04/15/2023 05:04:45 - INFO - __main__ -   epoch 0 step 2900 loss 0.577
04/15/2023 05:05:42 - INFO - __main__ -   epoch 0 step 3000 loss 0.5565
04/15/2023 05:06:38 - INFO - __main__ -   epoch 0 step 3100 loss 0.5919
04/15/2023 05:07:34 - INFO - __main__ -   epoch 0 step 3200 loss 0.5881
04/15/2023 05:08:31 - INFO - __main__ -   epoch 0 step 3300 loss 0.5739
04/15/2023 05:09:28 - INFO - __main__ -   epoch 0 step 3400 loss 0.592
04/15/2023 05:10:24 - INFO - __main__ -   epoch 0 step 3500 loss 0.5853
04/15/2023 05:11:20 - INFO - __main__ -   epoch 0 step 3600 loss 0.5873
04/15/2023 05:12:17 - INFO - __main__ -   epoch 0 step 3700 loss 0.5924
04/15/2023 05:13:13 - INFO - __main__ -   epoch 0 step 3800 loss 0.5547
04/15/2023 05:14:10 - INFO - __main__ -   epoch 0 step 3900 loss 0.5653
04/15/2023 05:15:07 - INFO - __main__ -   epoch 0 step 4000 loss 0.5555
04/15/2023 05:16:03 - INFO - __main__ -   epoch 0 step 4100 loss 0.5862
04/15/2023 05:17:00 - INFO - __main__ -   epoch 0 step 4200 loss 0.5619
04/15/2023 05:17:57 - INFO - __main__ -   epoch 0 step 4300 loss 0.5381
04/15/2023 05:18:54 - INFO - __main__ -   epoch 0 step 4400 loss 0.5597
04/15/2023 05:19:50 - INFO - __main__ -   epoch 0 step 4500 loss 0.578
04/15/2023 05:20:47 - INFO - __main__ -   epoch 0 step 4600 loss 0.5478
04/15/2023 05:21:43 - INFO - __main__ -   epoch 0 step 4700 loss 0.5653
04/15/2023 05:22:39 - INFO - __main__ -   epoch 0 step 4800 loss 0.5579
04/15/2023 05:23:36 - INFO - __main__ -   epoch 0 step 4900 loss 0.5378
04/15/2023 05:24:32 - INFO - __main__ -   epoch 0 step 5000 loss 0.5335
04/15/2023 05:25:29 - INFO - __main__ -   epoch 0 step 5100 loss 0.5547
04/15/2023 05:26:25 - INFO - __main__ -   epoch 0 step 5200 loss 0.5311
04/15/2023 05:27:21 - INFO - __main__ -   epoch 0 step 5300 loss 0.5359
04/15/2023 05:28:18 - INFO - __main__ -   epoch 0 step 5400 loss 0.5367
04/15/2023 05:29:15 - INFO - __main__ -   epoch 0 step 5500 loss 0.5322
04/15/2023 05:30:11 - INFO - __main__ -   epoch 0 step 5600 loss 0.529
04/15/2023 05:31:07 - INFO - __main__ -   epoch 0 step 5700 loss 0.5461
04/15/2023 05:32:04 - INFO - __main__ -   epoch 0 step 5800 loss 0.5278
04/15/2023 05:33:00 - INFO - __main__ -   epoch 0 step 5900 loss 0.5342
04/15/2023 05:33:56 - INFO - __main__ -   epoch 0 step 6000 loss 0.5152
04/15/2023 05:34:53 - INFO - __main__ -   epoch 0 step 6100 loss 0.5192
04/15/2023 05:35:49 - INFO - __main__ -   epoch 0 step 6200 loss 0.5242
04/15/2023 05:36:21 - INFO - __main__ -
***** Running evaluation *****
04/15/2023 05:36:21 - INFO - __main__ -     Num examples = 2000
04/15/2023 05:36:21 - INFO - __main__ -     Batch size = 16
04/15/2023 05:37:01 - INFO - __main__ -     eval_ppl = 1.96214
04/15/2023 05:37:01 - INFO - __main__ -     ********************
/opt/conda/lib/python3.7/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)
Traceback (most recent call last):
  File "run.py", line 421, in <module>
    main()
  File "run.py", line 341, in main
    preds = model(source_ids)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/opt/conda/lib/python3.7/site-packages/torch/_utils.py", line 425, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/auto-video-copilot/code-generation/model.py", line 48, in forward
    return self.generate(source_ids)
  File "/home/auto-video-copilot/code-generation/model.py", line 96, in generate
    pred = beam.buildTargetTokens(hyp)[:self.beam_size]
TypeError: 'NoneType' object is not subscriptable

guoday commented 1 year ago

It appears that you have modified something, causing beam.buildTargetTokens(hyp) to return None. In the original code, that function always returns a list, so you will need to check your own changes to the code.
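
To illustrate how this kind of error can arise: in Python, a function that reaches its end without a `return` statement implicitly returns None, and slicing None raises the TypeError shown in the traceback. The sketch below is not the repository's code. `build_target_tokens_ok`, `build_target_tokens_broken`, `top_k`, and `EOS_ID` are hypothetical stand-ins, written under the assumption that the real function truncates each hypothesis at the end-of-sequence token and returns a list of token lists.

```python
from typing import List

EOS_ID = 2  # assumption: id of '</s>', as it appears in the logged target_ids


def build_target_tokens_ok(preds: List[List[int]], eos_id: int = EOS_ID) -> List[List[int]]:
    """Hypothetical stand-in: cut each hypothesis at the first EOS token."""
    sentences = []
    for pred in preds:
        tokens = []
        for tok in pred:
            if tok == eos_id:
                break
            tokens.append(tok)
        sentences.append(tokens)
    return sentences


def build_target_tokens_broken(preds: List[List[int]], eos_id: int = EOS_ID):
    """Same loop, but the final `return` was lost, e.g. during a local edit."""
    sentences = []
    for pred in preds:
        tokens = []
        for tok in pred:
            if tok == eos_id:
                break
            tokens.append(tok)
        sentences.append(tokens)
    # No return statement here, so Python returns None implicitly.


def top_k(result, k: int):
    """Fail early with a readable message instead of a bare TypeError."""
    if result is None:
        raise ValueError("token builder returned None; check for a missing `return`")
    return result[:k]


hyps = [[19, 3763, 603, 2, 1, 1], [19, 956, 2, 1]]
print(top_k(build_target_tokens_ok(hyps), 3))  # [[19, 3763, 603], [19, 956]]

try:
    build_target_tokens_broken(hyps)[:3]
except TypeError as err:
    print(err)  # 'NoneType' object is not subscriptable
```

Since the traceback points at a local copy (/home/auto-video-copilot/code-generation/model.py), comparing that file and the accompanying beam search code against the upstream version, for example with `git diff`, is probably the quickest way to find such a change.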