Codium-ai / pr-agent

🚀CodiumAI PR-Agent: An AI-Powered 🤖 Tool for Automated Pull Request Analysis, Feedback, Suggestions and More! 💻🔍
Apache License 2.0
5.63k stars 519 forks

PR agent cannot detect simple "undefined variable" error if the PR changes are bigger #1186

Open chuanran opened 2 weeks ago

chuanran commented 2 weeks ago

I am using PR Agent powered by the llama-3-1-70b-instruct model to review PRs. (I understand it is not the best model for coding compared with GPT-4 etc., but it is the best one I can use in my project for now.)

If my PR changes only one file and that file contains an undefined variable (note that this is a PowerShell script, so I cannot use a linter like pylint to detect it), PR Agent easily finds the bug and points it out as follows:

[Image: Screenshot 2024-08-28 at 10 40 45 PM]

However, if the PR changes 5-6 files or more, it does not detect that simple "undefined variable" bug at all. I understand that an LLM is a probabilistic model and is not guaranteed to work every time, but even after many attempts it still cannot detect this simple bug when the PR changes are bigger.

I tried the following to resolve the issue, but in vain:

However, it still cannot detect that simple "undefined variable" error. Is there anything I have missed, or are there any suggestions that could help handle such cases better? Thanks in advance.
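For reference, the kind of "undefined variable" bug described above can also be caught deterministically, independent of the model. The following is only a minimal sketch: a hypothetical regex-based pre-check written in Python, not part of PR-Agent and not a real PowerShell parser. The function name `find_undefined`, the pattern set, and the sample script are all assumptions made for illustration. It ignores scoping, string quoting rules, and multi-line constructs.

```python
import re

# Heuristic patterns; they cover common cases only, not the full PowerShell grammar.
VAR = re.compile(r"\$([A-Za-z_]\w*)")
ASSIGN = re.compile(r"\$([A-Za-z_]\w*)\s*[-+*/]?=(?!=)")
PARAM = re.compile(r"param\s*\(([^)]*)\)", re.IGNORECASE)
FOREACH = re.compile(r"foreach\s*\(\s*\$([A-Za-z_]\w*)\s+in\b", re.IGNORECASE)

# A partial list of automatic variables and scope prefixes to ignore (lowercase).
AUTOMATIC = {
    "_", "null", "true", "false", "args", "env", "error", "input", "this",
    "matches", "lastexitcode", "psscriptroot", "pscmdlet", "psitem",
    "script", "global", "local",
}


def find_undefined(script):
    """Return (line_number, variable) pairs for variables used before any assignment."""
    defined = set(AUTOMATIC)
    findings = []
    for lineno, line in enumerate(script.splitlines(), 1):
        # Drop trailing comments (naive: does not handle '#' inside strings).
        code = line.split("#", 1)[0]
        # Names introduced on this line: assignments, param(...) blocks, foreach loops.
        new_defs = {name.lower() for name in ASSIGN.findall(code)}
        for block in PARAM.findall(code):
            new_defs |= {name.lower() for name in VAR.findall(block)}
        new_defs |= {name.lower() for name in FOREACH.findall(code)}
        # PowerShell variable names are case-insensitive, so compare in lowercase.
        for match in VAR.finditer(code):
            name = match.group(1).lower()
            if name not in defined and name not in new_defs:
                findings.append((lineno, match.group(0)))
        defined |= new_defs
    return findings


# Hypothetical script: $logPath is a typo for $logDir.
sample = r'''$logDir = "C:\logs"
function Get-Status {
    param($name)
    Write-Output "checking $name"
}
Write-Output "logs in $logPath"
'''

findings = find_undefined(sample)
for lineno, name in findings:
    print(f"line {lineno}: {name} is used but never assigned")
```

A check like this sees each whole file rather than a cropped diff, so its result does not depend on how many files the PR touches. It will produce false positives and false negatives on real scripts.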

mrT23 commented 2 weeks ago

use better models

chuanran commented 2 weeks ago

use better models

Thanks @mrT23. Unfortunately, I cannot use a better model for now, for compliance and cost reasons. Do you have any suggestions for configuration changes in configuration.toml, or other parameter tuning, that would improve the results?

Even if I used GPT-4, I believe some tuning or better prompts could still improve the results?

mrT23 commented 2 weeks ago

PR code is maybe the most difficult "code" there is. It spans many files, has limited and cropped context, and even the code itself is presented in a non-standard -/+ hunk diff format, which is hard to digest.

Hence, it needs good code models. Our prompts are excellent and work well with good code models.
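To illustrate the "limited and cropped context" point, here is a small sketch using Python's standard `difflib` module. It is not how PR-Agent builds its prompts; the file name `deploy.ps1` and the script contents are hypothetical. It shows that a unified diff hunk with the usual three lines of context can contain the use of a variable without the line that defines the correctly named one.

```python
import difflib

# Hypothetical PowerShell file before the change.
old_lines = [
    '$logDir = "C:\\logs"',
    '$retries = 3',
    'function Get-Status {',
    '    param($name)',
    '    Write-Output "checking $name"',
    '}',
    'Get-Status -name "svc"',
    'Write-Output "done"',
]

# After the change: the last line now uses $logPath, which is never defined
# (the author presumably meant $logDir from line 1).
new_lines = old_lines[:-1] + ['Write-Output "done, logs in $logPath"']

# Build a unified diff with three lines of context, the common default.
diff_lines = list(
    difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile="a/deploy.ps1",
        tofile="b/deploy.ps1",
        n=3,
        lineterm="",
    )
)
diff_text = "\n".join(diff_lines)
print(diff_text)

# The hunk shows the new use of $logPath, but the line that defines $logDir
# lies outside the context window and is absent from the diff.
print("$logDir visible in diff:", "$logDir" in diff_text)
```

A reviewer, human or model, that sees only this hunk has no direct evidence of which variables exist earlier in the file, which is one reason such bugs are harder to spot from a diff than from the full file.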

You can use PR-Agent Pro, where we take care of everything for you, including models: https://pr-agent-docs.codium.ai/overview/pr_agent_pro/ We use a combination of GPT-4 and Sonnet 3.5.