salesforce / CodeT5

Home of CodeT5: Open Code LLMs for Code Understanding and Generation
https://arxiv.org/abs/2305.07922
BSD 3-Clause "New" or "Revised" License

CodeT5 with CodeGen encoder and XGen or CodeGen2.5 decoder #129

Open sgaseretto opened 1 year ago

sgaseretto commented 1 year ago

Not an issue, but since this repo doesn't have a Discussions section, I wanted to ask whether something like this would be achievable. Both CodeT5 and CodeT5+ were trained with a context length of 512 tokens. Having something like XGen's context length would be amazing for working with large codebases and for building more complex coding agents to aid in software development.
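
To illustrate the gap, here is a minimal sketch (my own illustration, not anything from this repo) that reads the two context windows via Hugging Face `transformers`; the model ids are the publicly released Salesforce checkpoints, and the attribute lookup is hedged since config fields can differ between releases:

```python
# Minimal sketch: compare the context window of a CodeT5+ checkpoint
# with XGen's. Assumes `transformers` is installed and the public
# Salesforce model ids on Hugging Face are reachable.
from transformers import AutoConfig, AutoTokenizer

# CodeT5 / CodeT5+ tokenizers were set up for 512-token sequences.
codet5p_tok = AutoTokenizer.from_pretrained("Salesforce/codet5p-220m")
print("CodeT5+ max length:", codet5p_tok.model_max_length)  # 512

# XGen advertises an 8k context window; its config should report it.
# getattr is used defensively in case the field name differs.
xgen_cfg = AutoConfig.from_pretrained("Salesforce/xgen-7b-8k-base")
print("XGen context:", getattr(xgen_cfg, "max_position_embeddings", "see model card"))
```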