Tejaswgupta closed this 6 months ago
We are still working on that issue. We have already evaluated CodeQwen on SWE-bench (see https://qwenlm.github.io/blog/codeqwen1.5/) without much prompt engineering. Its performance is better than gpt-3.5-turbo, but we think it is still far from good.
Is there a sample implementation of CodeQwen (or any other code LLM) aimed at PR review, specifically one that focuses on code quality and flagging bugs?
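For anyone landing here with the same question, below is a minimal sketch of how a PR-review pass could be wired up around any code LLM. It is not an official CodeQwen example. The prompt wording, the `<file>:<line>: <comment>` reply format, the `max_chars` budget, and the names `build_review_prompt`, `parse_findings`, and `review_diff` are all assumptions made for illustration. The model call is left as a plain `generate` callable so no particular inference API is implied.

```python
from typing import Callable, List

# Hypothetical instruction text; tune the wording for the model you use.
REVIEW_INSTRUCTIONS = (
    "You are reviewing a pull request. For the diff below, list likely bugs "
    "and code-quality problems. Write one finding per line in the form "
    "'<file>:<line>: <comment>'. If you find nothing, reply with 'LGTM'."
)


def build_review_prompt(diff: str, max_chars: int = 8000) -> str:
    """Combine the review instructions with a (possibly truncated) unified diff."""
    # Truncate by characters as a crude stand-in for a context-length budget.
    body = diff[:max_chars]
    return f"{REVIEW_INSTRUCTIONS}\n\n--- DIFF START ---\n{body}\n--- DIFF END ---\n"


def parse_findings(reply: str) -> List[str]:
    """Keep only reply lines that look like '<file>:<line>: <comment>'."""
    findings = []
    for line in reply.splitlines():
        parts = line.strip().split(":", 2)
        # Require a file part, a numeric line part, and a non-empty comment.
        if len(parts) == 3 and parts[1].strip().isdigit() and parts[2].strip():
            findings.append(line.strip())
    return findings


def review_diff(diff: str, generate: Callable[[str], str]) -> List[str]:
    """Run one review pass; `generate` is any function mapping prompt to reply."""
    return parse_findings(generate(build_review_prompt(diff)))
```

To try it, pass the output of `git diff` as `diff` and wrap your model of choice in a function that takes the prompt string and returns the generated text. Filtering the reply through `parse_findings` drops free-form chatter, which makes it easier to post each finding as a separate review comment.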