NougatCA / SPT-Code


The class BartForClassificationAndGeneration's forward function in bart.py #5

Open · unose opened this issue 5 months ago

unose commented 5 months ago

The code snippets below are simplified to highlight only the control flow around set_model_mode(): the first task, TASK_CODE_AST_PREDICTION, sets the mode to MODEL_MODE_CLS, while the next two tasks set it to MODEL_MODE_GEN.

When running the program, the two tasks that use MODEL_MODE_GEN worked well. However, MODEL_MODE_CLS, set by TASK_CODE_AST_PREDICTION, caused a runtime error raised from the forward() function of BartForClassificationAndGeneration. In that function, only the MODEL_MODE_GEN branch of the if statement (CODE LINK) dispatches to forward_gen(); any other mode falls through and raises ValueError.

pre_train.py

def pre_train():
    # Simplified: only the per-task mode switching is shown; the training
    # for each task is elided.
    for task in tasks:
        if task == enums.TASK_CODE_AST_PREDICTION:
            model.set_model_mode(enums.MODEL_MODE_CLS)
        elif task == enums.TASK_MASS:
            model.set_model_mode(enums.MODEL_MODE_GEN)
        elif task == enums.TASK_METHOD_NAME_PREDICTION:
            model.set_model_mode(enums.MODEL_MODE_GEN)
    return model, (code_vocab, ast_vocab, nl_vocab)

bart.py

class BartForClassificationAndGeneration(BartForConditionalGeneration):
    def forward(self, input_ids=None, **kwargs):   # signature simplified
        if self.mode == enums.MODEL_MODE_GEN:
            return self.forward_gen(input_ids=input_ids, **kwargs)
        else:
            raise ValueError
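
For reference, here is a minimal, self-contained sketch (assumed, not copied from the SPT-Code sources) of how the mode flag drives this dispatch; it reproduces the failure path, where any mode other than MODEL_MODE_GEN falls through to the bare ValueError seen in the traceback below.

# Minimal sketch; the constants and method bodies below are placeholders,
# not the actual SPT-Code implementation.
MODEL_MODE_GEN = 'gen'
MODEL_MODE_CLS = 'cls'

class ModeDispatchSketch:

    def set_model_mode(self, mode):
        # Remember the current mode; forward() dispatches on it.
        self.mode = mode

    def forward_gen(self, input_ids=None, **kwargs):
        # Generation path; the real model runs the seq2seq forward pass here.
        return 'generation output'

    def forward(self, input_ids=None, **kwargs):
        if self.mode == MODEL_MODE_GEN:
            return self.forward_gen(input_ids=input_ids, **kwargs)
        # MODEL_MODE_CLS reaches this point in the released code.
        raise ValueError

model = ModeDispatchSketch()
model.set_model_mode(MODEL_MODE_CLS)
model.forward()   # raises ValueError, matching the traceback below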

Environment:

Ubuntu 22.04.4 LTS, Python 3.8.17, Conda 23.5.2

Command:

python main.py --do-pre-train --pre-train-tasks cap --batch-size 16 --eval-batch-size 32 --cuda-visible-devices 0 --fp16 --model-name pre_train

(Here, cap is the Code-AST Prediction pre-training task, i.e., enums.TASK_CODE_AST_PREDICTION, which switches the model to MODEL_MODE_CLS.)

Error message:

Running configurations initialized successfully
----------------------------------------------------------------------------------------------------
Start pre-training task: cap
  0%|          | 0/1770 [00:00<?, ?it/s]Traceback (most recent call last):
  File "main.py", line 100, in <module>
    main(main_args)
  File "main.py", line 22, in main
    model, vocab = pre_train(args=args)
  File "/home/myoungkyu@unomaha.edu/Documents/0-research-spt-code/spt-code/sources/pre_train.py", line 218, in pre_train
    cap_result = trainer.train()
  File "/my-conda-path/envs/spt-code/lib/python3.8/site-packages/transformers/trainer.py", line 1624, in train
    return inner_training_loop(
  File "/my-conda-path/envs/spt-code/lib/python3.8/site-packages/transformers/trainer.py", line 1961, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/my-conda-path/envs/spt-code/lib/python3.8/site-packages/transformers/trainer.py", line 2902, in training_step
    loss = self.compute_loss(model, inputs)
  File "/my-conda-path/envs/spt-code/lib/python3.8/site-packages/transformers/trainer.py", line 2925, in compute_loss
    outputs = model(**inputs)
  File "/my-conda-path/envs/spt-code/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/my-conda-path/envs/spt-code/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/my-conda-path/envs/spt-code/lib/python3.8/site-packages/accelerate/utils/operations.py", line 822, in forward
    return model_forward(*args, **kwargs)
  File "/my-conda-path/envs/spt-code/lib/python3.8/site-packages/accelerate/utils/operations.py", line 810, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "/my-conda-path/envs/spt-code/lib/python3.8/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
  File "/home/myoungkyu@unomaha.edu/Documents/0-research-spt-code/spt-code/sources/models/bart.py", line 85, in forward
    raise ValueError
ValueError

As shown below, there is only one location in the source files that sets the model mode to MODEL_MODE_CLS; every other call to set_model_mode() uses MODEL_MODE_GEN.

/my-program-path/sources $ grep -r set_model_mode . --include=*.py
./downstream_tasks/completion.py:    model.set_model_mode(enums.MODEL_MODE_GEN)
./downstream_tasks/translation.py:    model.set_model_mode(enums.MODEL_MODE_GEN)
./downstream_tasks/summarization.py:    model.set_model_mode(enums.MODEL_MODE_GEN)
./downstream_tasks/search.py:    model.set_model_mode(enums.MODEL_MODE_GEN)
./downstream_tasks/bug_fix.py:    model.set_model_mode(enums.MODEL_MODE_GEN)
./models/bart.py:            self.set_model_mode(mode)
./models/bart.py:    def set_model_mode(self, mode):
./pre_train.py:            model.set_model_mode(enums.MODEL_MODE_CLS)
./pre_train.py:            model.set_model_mode(enums.MODEL_MODE_GEN)
./pre_train.py:            model.set_model_mode(enums.MODEL_MODE_GEN)

I want to understand the case in which TASK_CODE_AST_PREDICTION sets the mode to MODEL_MODE_CLS in pre_train.py (line 153). I wonder whether there is something I missed; the other two pre-training tasks worked well.

NougatCA commented 4 months ago

Hi @unose,

I'm very sorry; I think I made the mistake of uploading a later version I was using, in which the "CLS" branch was removed from the model's forward function. I will upload the corrected version as soon as possible.
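
In the meantime, the missing branch presumably dispatches MODEL_MODE_CLS to a classification forward pass, roughly as in the sketch below; forward_cls is an assumed helper name and is not taken from the repository.

# Hypothetical sketch of the restored dispatch in bart.py; forward_cls and its
# signature are assumptions, not the actual updated SPT-Code code.
def forward(self, input_ids=None, **kwargs):
    if self.mode == enums.MODEL_MODE_GEN:
        return self.forward_gen(input_ids=input_ids, **kwargs)
    elif self.mode == enums.MODEL_MODE_CLS:
        return self.forward_cls(input_ids=input_ids, **kwargs)
    else:
        raise ValueError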

unose commented 4 months ago

Thank you so much for your consideration.