According to the paper "CODEGEN: AN OPEN LARGE LANGUAGE MODEL FOR CODE WITH MULTI-TURN PROGRAM SYNTHESIS", CodeGen follows a standard transformer decoder architecture with left-to-right causal masking. How do you use CodeGen-mono 350M (a decoder) to initialize the encoder? As far as I know, there is a slight difference between an encoder and a decoder.