Closed — antonio-mastropaolo closed this issue 1 year ago
Hi there, by setting encoding['decoder_input_ids'] = encoding['input_ids'].clone(), we also feed the text prompt to the decoder, which provides prefix context for the model. We find this very helpful for CodeT5+ models >= 2B: these models have a deep decoder initialized from a frozen GPT-style LLM, so feeding the prompt to the decoder is more compatible with the default behaviour of GPT models. Note that CodeT5+ 220M and 770M do not need such an additional prefix prompt, as they are pretrained from scratch.
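A minimal sketch of what that one line does, using dummy token IDs in place of real tokenizer output (the tensor values and shapes here are illustrative assumptions, not from the thread):

```python
import torch

# Stand-in for what a Hugging Face tokenizer returns with return_tensors="pt":
# a dict mapping "input_ids" / "attention_mask" to tensors of token IDs.
encoding = {
    "input_ids": torch.tensor([[101, 2054, 2003, 102]]),  # dummy prompt tokens
    "attention_mask": torch.ones(1, 4, dtype=torch.long),
}

# The line under discussion: duplicate the encoder prompt as the decoder input,
# so the GPT-style decoder also sees the prompt as its prefix context.
encoding["decoder_input_ids"] = encoding["input_ids"].clone()

# .clone() makes an independent copy, so later edits to the decoder input
# (e.g. by generate()) cannot mutate the encoder input in place.
encoding["decoder_input_ids"][0, 0] = 0
print(encoding["input_ids"][0, 0].item())   # encoder side is unchanged
```

The resulting dict can then be passed straight to model.generate(**encoding); without the cloned decoder_input_ids, generation starts from the model's default decoder start token instead of the prompt prefix.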
@yuewang-cuhk Crystal clear! Many thanks :)
Hi all, I was looking at the code released to generate predictions with CodeT5+,
and I was wondering what the difference would be if we factored out the following instruction:
encoding['decoder_input_ids'] = encoding['input_ids'].clone()
What changes under the hood of the model? Thanks in advance for any help you can provide.