multicloudlab / pr-agent-test


debug log for improve #2

Open gyliu513 opened 4 days ago

gyliu513 commented 4 days ago
(pr-agent) gyliu@guangyas-air pr-agent %  cd /Users/gyliu/go/src/github.com/Codium-ai/pr-agent ; /usr/bin/env /Users/gyliu/pr-agent/bin/python /Users/gyliu/.vscode/extensions/ms-python.debugpy-2024.6.0-darwin-arm64/bundled/libs/debugpy/adapter/../../debugpy/launcher 49514 -- /Users/gyliu/go/src/github.com/Codium-ai/pr-agent/pr_agent/cli_pip.py
2024-07-03 11:50:07.547 | INFO     | pr_agent.git_providers.git_provider:get_main_pr_language:220 - No languages detected
2024-07-03 11:50:07.551 | INFO     | pr_agent.tools.pr_code_suggestions:__init__:35 - Setting max_model_tokens to 10000 for PR improve
2024-07-03 11:50:07.554 | INFO     | pr_agent.tools.pr_code_suggestions:_get_is_extended:352 - Extended mode is enabled automatically based on the configuration toggle
2024-07-03 11:50:13.167 | INFO     | pr_agent.tools.pr_code_suggestions:run:74 - Generating code suggestions for PR...
2024-07-03 11:50:13.170 | DEBUG    | pr_agent.tools.pr_code_suggestions:run:77 - Relevant configs
2024-07-03 11:50:15.712 | DEBUG    | pr_agent.algo.pr_processing:retry_with_fallback_models:329 - Generating prediction with gpt-4-turbo-2024-04-09
2024-07-03 11:50:16.029 | INFO     | pr_agent.tools.pr_code_suggestions:_prepare_prediction_extended:360 - Number of PR chunk calls: 1
2024-07-03 11:50:16.030 | DEBUG    | pr_agent.tools.pr_code_suggestions:_prepare_prediction_extended:361 - PR diff:
2024-07-03 11:50:16.037 | DEBUG    | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:139 - Prompts
2024-07-03 11:50:28.864 | DEBUG    | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:160 - 

AI response: code_suggestions:

gyliu513 commented 4 days ago

watsonx

(pr-agent) gyliu@guangyas-air pr-agent % /Users/gyliu/pr-agent/bin/python /Users/gyliu/go/src/github.com/Codium-ai/pr-agent/pr_agent/cli_pip.py
2024-07-03 22:11:18.792 | INFO     | pr_agent.git_providers.git_provider:get_main_pr_language:220 - No languages detected
2024-07-03 22:11:18.794 | INFO     | pr_agent.tools.pr_code_suggestions:__init__:35 - Setting max_model_tokens to 10000 for PR improve
2024-07-03 22:11:18.795 | INFO     | pr_agent.tools.pr_code_suggestions:_get_is_extended:352 - Extended mode is enabled automatically based on the configuration toggle
2024-07-03 22:11:18.970 | INFO     | pr_agent.tools.pr_code_suggestions:run:74 - Generating code suggestions for PR...
2024-07-03 22:11:18.971 | DEBUG    | pr_agent.tools.pr_code_suggestions:run:77 - Relevant configs
2024-07-03 22:11:19.364 | DEBUG    | pr_agent.algo.pr_processing:retry_with_fallback_models:329 - Generating prediction with watsonx/meta-llama/llama-3-8b-instruct
2024-07-03 22:11:19.669 | INFO     | pr_agent.tools.pr_code_suggestions:_prepare_prediction_extended:360 - Number of PR chunk calls: 1
2024-07-03 22:11:19.669 | DEBUG    | pr_agent.tools.pr_code_suggestions:_prepare_prediction_extended:361 - PR diff:
2024-07-03 22:11:19.669 | INFO     | pr_agent.tools.pr_code_suggestions:_prepare_prediction_extended:362 - ['\n\n## file: \'pr-agent.py\'\n\n@@ -0,0 +1,9 @@\nnew hunk\n1 +# This program prints Hello, world!\n2 +\n3 +printfx(\'Hello, world!\')\n4 +\n5 +for i=0; i<10; i++ {\n6 + printfx("how to build for loop")\n7 +}\n8 +\n9 +# There aare soome typooo']
2024-07-03 22:11:19.674 | DEBUG    | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:155 - Prompts
2024-07-03 22:11:19.674 | INFO     | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:158 - System prompt: You are PR-Reviewer, a language model that specializes in suggesting ways to improve for a Pull Request (PR) code. Your task is to provide meaningful and actionable code suggestions, to improve the new code presented in a PR diff.

The format we will use to present the PR code diff:

file: 'src/file1.py'

@@ ... @@ def func1():
new hunk
12 code line1 that remained unchanged in the PR
13 +new hunk code line2 added in the PR
14 code line3 that remained unchanged in the PR
old hunk
code line1 that remained unchanged in the PR
-old hunk code line2 that was removed in the PR
code line3 that remained unchanged in the PR

@@ ... @@ def func2():
new hunk
...
old hunk
...

file: 'src/file2.py'

...

Specific instructions for generating code suggestions:

The output must be a YAML object equivalent to type $PRCodeSuggestions, according to the following Pydantic definitions:

class CodeSuggestion(BaseModel):
    relevant_file: str = Field(description="the relevant file full path")
    language: str = Field(description="the code language of the relevant file")
    suggestion_content: str = Field(description="an actionable suggestion for meaningfully improving the new code introduced in the PR")
    existing_code: str = Field(description="a short code snippet, demonstrating the relevant code lines from a 'new hunk' section. It must be without line numbers. Use abbreviations if needed")
    improved_code: str = Field(description="a new code snippet, that can be used to replace the relevant 'existing_code' lines in 'new hunk' code after applying the suggestion")
    one_sentence_summary: str = Field(description="a short summary of the suggestion action, in a single sentence. Focus on the 'what'. Be general, and avoid method or variable names.")
    relevant_lines_start: int = Field(description="The relevant line number, from a 'new hunk' section, where the suggestion starts (inclusive). Should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above")
    relevant_lines_end: int = Field(description="The relevant line number, from a 'new hunk' section, where the suggestion ends (inclusive). Should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above")
    label: str = Field(description="a single label for the suggestion, to help the user understand the suggestion type. For example: 'security', 'possible bug', 'possible issue', 'performance', 'enhancement', 'best practice', 'maintainability', etc. Other labels are also allowed")

class PRCodeSuggestions(BaseModel):
    code_suggestions: List[CodeSuggestion]

Example output:

code_suggestions:
- relevant_file: |
    src/file1.py
  language: |
    python
  suggestion_content: |
    ...
  existing_code: |
    ...
  improved_code: |
    ...
  one_sentence_summary: |
    ...
  relevant_lines_start: 12
  relevant_lines_end: 13
  label: |
    ...

Each YAML output MUST be after a newline, indented, with block scalar indicator ('|').
2024-07-03 22:11:19.675 | INFO     | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:159 - User prompt: PR Info:

Title: 'xxx'

The PR Diff:

file: 'pr-agent.py'

@@ -0,0 +1,9 @@
new hunk
1 +# This program prints Hello, world!
2 +
3 +printfx('Hello, world!')
4 +
5 +for i=0; i<10; i++ {
6 + printfx("how to build for loop")
7 +}
8 +
9 +# There aare soome typooo

Response (should be a valid YAML, and nothing else):


2024-07-03 22:11:21.825 | DEBUG    | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:176 - 
AI response:
code_suggestions:
- relevant_file: |
    src/pr-agent.py
  language: |

2024-07-03 22:11:21.826 | DEBUG    | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:180 - Full_response
2024-07-03 22:11:21.827 | INFO     | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:184 - 
AI response:
code_suggestions:
- relevant_file: |
    src/pr-agent.py
  language: |

2024-07-03 22:11:21.828 | DEBUG    | pr_agent.tools.pr_code_suggestions:_prepare_pr_code_suggestions:248 - Skipping suggestion 1, because it does not contain 'one_sentence_summary':
'{'relevant_file': 'src/pr-agent.py\n', 'language': ''}
2024-07-03 22:11:21.830 | ERROR    | pr_agent.tools.pr_code_suggestions:run:93 - No code suggestions found for PR.
2024-07-03 22:11:21.831 | DEBUG    | pr_agent.tools.pr_code_suggestions:run:95 - PR output
(pr-agent) gyliu@guangyas-air pr-agent % 
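
The skip message in this run follows from the truncated watsonx response: the parsed suggestion carries only `relevant_file` and an empty `language`, so a check for `one_sentence_summary` cannot pass. Below is a minimal sketch of that kind of check. The helper `filter_complete_suggestions` and the `REQUIRED_KEYS` tuple are hypothetical names for illustration, with the key list copied from the `CodeSuggestion` fields in the prompt; this is not pr-agent's actual implementation, which according to the log message tests at least `one_sentence_summary`.

```python
# Hypothetical required-key list, copied from the CodeSuggestion fields in the prompt.
REQUIRED_KEYS = (
    "relevant_file", "language", "suggestion_content", "existing_code",
    "improved_code", "one_sentence_summary", "relevant_lines_start",
    "relevant_lines_end", "label",
)


def filter_complete_suggestions(suggestions):
    """Split parsed suggestions into (kept, skipped).

    A suggestion is kept only if every required key is present.
    Each skipped entry is (1-based index, list of missing keys).
    """
    kept, skipped = [], []
    for index, suggestion in enumerate(suggestions, start=1):
        missing = [key for key in REQUIRED_KEYS if key not in suggestion]
        if missing:
            skipped.append((index, missing))
        else:
            kept.append(suggestion)
    return kept, skipped


# The truncated suggestion exactly as it appears in the watsonx log.
truncated = [{"relevant_file": "src/pr-agent.py\n", "language": ""}]
kept, skipped = filter_complete_suggestions(truncated)
print(kept)     # complete suggestions, if any
print(skipped)  # which suggestion was dropped and which keys were missing
```

With no complete suggestion left, a caller would have nothing to publish, which matches the "No code suggestions found for PR." error in the log.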
gyliu513 commented 4 days ago

OpenAI

(pr-agent) gyliu@guangyas-air pr-agent % /Users/gyliu/pr-agent/bin/python /Users/gyliu/go/src/github.com/Codium-ai/pr-agent/pr_agent/cli_pip.py
2024-07-03 22:29:00.253 | INFO     | pr_agent.git_providers.git_provider:get_main_pr_language:220 - No languages detected
2024-07-03 22:29:00.254 | INFO     | pr_agent.tools.pr_code_suggestions:__init__:35 - Setting max_model_tokens to 10000 for PR improve
2024-07-03 22:29:00.255 | INFO     | pr_agent.tools.pr_code_suggestions:_get_is_extended:352 - Extended mode is enabled automatically based on the configuration toggle
2024-07-03 22:29:00.403 | INFO     | pr_agent.tools.pr_code_suggestions:run:74 - Generating code suggestions for PR...
2024-07-03 22:29:00.404 | DEBUG    | pr_agent.tools.pr_code_suggestions:run:77 - Relevant configs
2024-07-03 22:29:00.904 | DEBUG    | pr_agent.algo.pr_processing:retry_with_fallback_models:329 - Generating prediction with gpt-4-turbo-2024-04-09
2024-07-03 22:29:01.357 | INFO     | pr_agent.tools.pr_code_suggestions:_prepare_prediction_extended:360 - Number of PR chunk calls: 1
2024-07-03 22:29:01.357 | DEBUG    | pr_agent.tools.pr_code_suggestions:_prepare_prediction_extended:361 - PR diff:
2024-07-03 22:29:01.357 | INFO     | pr_agent.tools.pr_code_suggestions:_prepare_prediction_extended:362 - ['\n\n## file: \'pr-agent.py\'\n\n@@ -0,0 +1,9 @@\nnew hunk\n1 +# This program prints Hello, world!\n2 +\n3 +printfx(\'Hello, world!\')\n4 +\n5 +for i=0; i<10; i++ {\n6 + printfx("how to build for loop")\n7 +}\n8 +\n9 +# There aare soome typooo']
2024-07-03 22:29:01.362 | DEBUG    | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:155 - Prompts
2024-07-03 22:29:01.362 | INFO     | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:158 - System prompt: You are PR-Reviewer, a language model that specializes in suggesting ways to improve for a Pull Request (PR) code. Your task is to provide meaningful and actionable code suggestions, to improve the new code presented in a PR diff.

The format we will use to present the PR code diff:

file: 'src/file1.py'

@@ ... @@ def func1():
new hunk
12 code line1 that remained unchanged in the PR
13 +new hunk code line2 added in the PR
14 code line3 that remained unchanged in the PR
old hunk
code line1 that remained unchanged in the PR
-old hunk code line2 that was removed in the PR
code line3 that remained unchanged in the PR

@@ ... @@ def func2():
new hunk
...
old hunk
...

file: 'src/file2.py'

...

Specific instructions for generating code suggestions:

The output must be a YAML object equivalent to type $PRCodeSuggestions, according to the following Pydantic definitions:

class CodeSuggestion(BaseModel):
    relevant_file: str = Field(description="the relevant file full path")
    language: str = Field(description="the code language of the relevant file")
    suggestion_content: str = Field(description="an actionable suggestion for meaningfully improving the new code introduced in the PR")
    existing_code: str = Field(description="a short code snippet, demonstrating the relevant code lines from a 'new hunk' section. It must be without line numbers. Use abbreviations if needed")
    improved_code: str = Field(description="a new code snippet, that can be used to replace the relevant 'existing_code' lines in 'new hunk' code after applying the suggestion")
    one_sentence_summary: str = Field(description="a short summary of the suggestion action, in a single sentence. Focus on the 'what'. Be general, and avoid method or variable names.")
    relevant_lines_start: int = Field(description="The relevant line number, from a 'new hunk' section, where the suggestion starts (inclusive). Should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above")
    relevant_lines_end: int = Field(description="The relevant line number, from a 'new hunk' section, where the suggestion ends (inclusive). Should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above")
    label: str = Field(description="a single label for the suggestion, to help the user understand the suggestion type. For example: 'security', 'possible bug', 'possible issue', 'performance', 'enhancement', 'best practice', 'maintainability', etc. Other labels are also allowed")

class PRCodeSuggestions(BaseModel):
    code_suggestions: List[CodeSuggestion]

Example output:

code_suggestions:
- relevant_file: |
    src/file1.py
  language: |
    python
  suggestion_content: |
    ...
  existing_code: |
    ...
  improved_code: |
    ...
  one_sentence_summary: |
    ...
  relevant_lines_start: 12
  relevant_lines_end: 13
  label: |
    ...

Each YAML output MUST be after a newline, indented, with block scalar indicator ('|').
2024-07-03 22:29:01.363 | INFO     | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:159 - User prompt: PR Info:

Title: 'xxx'

The PR Diff:

file: 'pr-agent.py'

@@ -0,0 +1,9 @@
new hunk
1 +# This program prints Hello, world!
2 +
3 +printfx('Hello, world!')
4 +
5 +for i=0; i<10; i++ {
6 + printfx("how to build for loop")
7 +}
8 +
9 +# There aare soome typooo

Response (should be a valid YAML, and nothing else):

2024-07-03 22:29:14.256 | DEBUG    | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:176 - 
AI response:
code_suggestions:
- relevant_file: |
    pr-agent.py
  language: |
    python
  suggestion_content: |
    Replace the incorrect function `printfx` with the correct Python print function `print`.
  existing_code: |
    printfx('Hello, world!')
  improved_code: |
    print('Hello, world!')
  one_sentence_summary: |
    Correct the function used for printing output.
  relevant_lines_start: 3
  relevant_lines_end: 3
  label: |
    possible bug

- relevant_file: |
    pr-agent.py
  language: |
    python
  suggestion_content: |
    Correct the syntax of the for loop to match Python's syntax. Use `range()` for the loop, and ensure proper indentation and colon usage.
  existing_code: |
    for i=0; i<10; i++ {
      printfx("how to build for loop")
    }
  improved_code: |
    for i in range(10):
        print("how to build for loop")
  one_sentence_summary: |
    Fix the syntax of the for loop to adhere to Python standards.
  relevant_lines_start: 5
  relevant_lines_end: 7
  label: |
    possible bug

- relevant_file: |
    pr-agent.py
  language: |
    python
  suggestion_content: |
    Correct the spelling mistakes in the comment to enhance code readability and professionalism.
  existing_code: |
    # There aare soome typooo
  improved_code: |
    # There are some typos
  one_sentence_summary: |
    Correct spelling errors in the comment.
  relevant_lines_start: 9
  relevant_lines_end: 9
  label: |
    enhancement
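
Each suggestion above pairs an inclusive `relevant_lines_start`/`relevant_lines_end` range with an `improved_code` replacement. Here is a rough sketch of how those three suggestions could be applied to the file under test. The helper `apply_suggestions` is hypothetical, the source text is reconstructed from the diff in the log (the indentation of line 6 is assumed), and nothing here claims to mirror how pr-agent publishes suggestions.

```python
# File contents reconstructed from the PR diff in the log (line 6 indentation assumed).
ORIGINAL = "\n".join([
    "# This program prints Hello, world!",
    "",
    "printfx('Hello, world!')",
    "",
    "for i=0; i<10; i++ {",
    '    printfx("how to build for loop")',
    "}",
    "",
    "# There aare soome typooo",
])

# Only the fields needed for patching, taken from the gpt-4-turbo response.
SUGGESTIONS = [
    {"relevant_lines_start": 3, "relevant_lines_end": 3,
     "improved_code": "print('Hello, world!')\n"},
    {"relevant_lines_start": 5, "relevant_lines_end": 7,
     "improved_code": 'for i in range(10):\n    print("how to build for loop")\n'},
    {"relevant_lines_start": 9, "relevant_lines_end": 9,
     "improved_code": "# There are some typos\n"},
]


def apply_suggestions(source, suggestions):
    """Replace each inclusive 1-based line range with its improved code."""
    lines = source.split("\n")
    # Work bottom-up so earlier line numbers stay valid when a
    # replacement changes the number of lines.
    ordered = sorted(suggestions, key=lambda s: s["relevant_lines_start"], reverse=True)
    for suggestion in ordered:
        start = suggestion["relevant_lines_start"]
        end = suggestion["relevant_lines_end"]
        lines[start - 1:end] = suggestion["improved_code"].rstrip("\n").split("\n")
    return "\n".join(lines)


patched = apply_suggestions(ORIGINAL, SUGGESTIONS)
compile(patched, "pr-agent.py", "exec")  # raises SyntaxError if the result is not valid Python
print(patched)
```

Applying the ranges from the bottom of the file upward matters here because the second suggestion shrinks three lines to two, which would otherwise shift the line numbers used by the third.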

2024-07-03 22:29:14.257 | DEBUG    | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:180 - Full_response
2024-07-03 22:29:14.257 | INFO     | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:184 - 
AI response:
code_suggestions:

Your input is a PR code, and a list of code suggestions that were generated for the PR. Your goal is to inspect, review and score the suggestsions. Be aware - the suggestions may not always be correct or accurate, and you should evaluate them in relation to the actual PR code diff presented. Sometimes the suggestion may ignore parts of the actual code diff, and in that case, you should give it a score of 0.

Specific instructions:

The format that is used to present the PR code diff is as follows:

file: 'src/file1.py'

@@ ... @@ def func1():
new hunk
12 code line1 that remained unchanged in the PR
13 +new hunk code line2 added in the PR
14 code line3 that remained unchanged in the PR
old hunk
code line1 that remained unchanged in the PR
-old hunk code line2 that was removed in the PR
code line3 that remained unchanged in the PR

@@ ... @@ def func2():
new hunk
...
old hunk
...

file: 'src/file2.py'

...

The output must be a YAML object equivalent to type $PRCodeSuggestionsFeedback, according to the following Pydantic definitions:

class CodeSuggestionFeedback(BaseModel):
    suggestion_summary: str = Field(description="repeated from the input")
    relevant_file: str = Field(description="repeated from the input")
    suggestion_score: int = Field(description="The actual output - the score of the suggestion, from 0 to 10. Give 0 if the suggestion is plain wrong. Otherwise, give a score from 1 to 10 (inclusive), where 1 is the lowest and 10 is the highest.")
    why: str = Field(description="Short and concise explanation of why the suggestion received the score (one to two sentences).")

class PRCodeSuggestionsFeedback(BaseModel):
    code_suggestions: List[CodeSuggestionFeedback]

Example output:

code_suggestions:
- suggestion_summary: |
    Use a more descriptive variable name here
  relevant_file: "src/file1.py"
  suggestion_score: 6
  why: |
    The variable name 't' is not descriptive enough
- ...

Each YAML output MUST be after a newline, indented, with block scalar indicator ('|').
2024-07-03 22:29:14.263 | INFO     | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:159 - User prompt: You are given a Pull Request (PR) code diff:

file: 'pr-agent.py'

@@ -0,0 +1,9 @@
new hunk
1 +# This program prints Hello, world!
2 +
3 +printfx('Hello, world!')
4 +
5 +for i=0; i<10; i++ {
6 + printfx("how to build for loop")
7 +}
8 +
9 +# There aare soome typooo

And here is a list of corresponding 3 code suggestions to improve this Pull Request code:

suggestion 1: {'relevant_file': 'pr-agent.py\n', 'language': 'python\n', 'suggestion_content': 'Replace the incorrect function printfx with the correct Python print function print.\n', 'existing_code': "printfx('Hello, world!')\n", 'improved_code': "print('Hello, world!')\n", 'one_sentence_summary': 'Correct the function used for printing output.\n', 'relevant_lines_start': 3, 'relevant_lines_end': 3, 'label': 'possible bug\n'}

suggestion 2: {'relevant_file': 'pr-agent.py\n', 'language': 'python\n', 'suggestion_content': "Correct the syntax of the for loop to match Python's syntax. Use range() for the loop, and ensure proper indentation and colon usage.\n", 'existing_code': 'for i=0; i<10; i++ {\n printfx("how to build for loop")\n}\n', 'improved_code': 'for i in range(10):\n print("how to build for loop")\n', 'one_sentence_summary': 'Fix the syntax of the for loop to adhere to Python standards.\n', 'relevant_lines_start': 5, 'relevant_lines_end': 7, 'label': 'possible bug\n'}

suggestion 3: {'relevant_file': 'pr-agent.py\n', 'language': 'python\n', 'suggestion_content': 'Correct the spelling mistakes in the comment to enhance code readability and professionalism.\n', 'existing_code': '# There aare soome typooo\n', 'improved_code': '# There are some typos\n', 'one_sentence_summary': 'Correct spelling errors in the comment.\n', 'relevant_lines_start': 9, 'relevant_lines_end': 9, 'label': 'enhancement\n'}

Response (should be a valid YAML, and nothing else):

2024-07-03 22:29:17.727 | DEBUG    | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:176 - 
AI response:
```yaml
code_suggestions:
- suggestion_summary: |
    Correct the function used for printing output.
  relevant_file: "pr-agent.py"
  suggestion_score: 10
  why: |
    The suggestion correctly identifies and fixes the incorrect function `printfx` to the correct Python `print` function, which is crucial for the code to run properly.

- suggestion_summary: |
    Fix the syntax of the for loop to adhere to Python standards.
  relevant_file: "pr-agent.py"
  suggestion_score: 10
  why: |
    The suggestion accurately corrects the for loop syntax to match Python's standards, including the use of `range()` and proper indentation, which is essential for the code to function correctly.

- suggestion_summary: |
    Correct spelling errors in the comment.
  relevant_file: "pr-agent.py"
  suggestion_score: 7
  why: |
    The suggestion improves code readability and professionalism by correcting spelling mistakes in the comment, although it does not affect the functionality of the code.
```

2024-07-03 22:29:17.728 | DEBUG    | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:180 - Full_response
2024-07-03 22:29:17.728 | INFO     | pr_agent.algo.ai_handlers.litellm_ai_handler:chat_completion:184 - AI response:

code_suggestions:
- suggestion_summary: |
    Correct the function used for printing output.
  relevant_file: "pr-agent.py"
  suggestion_score: 10
  why: |
    The suggestion correctly identifies and fixes the incorrect function `printfx` to the correct Python `print` function, which is crucial for the code to run properly.

- suggestion_summary: |
    Fix the syntax of the for loop to adhere to Python standards.
  relevant_file: "pr-agent.py"
  suggestion_score: 10
  why: |
    The suggestion accurately corrects the for loop syntax to match Python's standards, including the use of `range()` and proper indentation, which is essential for the code to function correctly.

- suggestion_summary: |
    Correct spelling errors in the comment.
  relevant_file: "pr-agent.py"
  suggestion_score: 7
  why: |
    The suggestion improves code readability and professionalism by correcting spelling mistakes in the comment, although it does not affect the functionality of the code.

2024-07-03 22:29:17.743 | DEBUG | pr_agent.tools.pr_code_suggestions:run:114 - PR output
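
In the scoring step of the gpt-4-turbo run, the model wrapped its answer in a Markdown `yaml` code fence even though the prompt asks for "a valid YAML, and nothing else", so the fence has to be removed before the text can be parsed as YAML. The following is one possible way to do that. The helper `strip_code_fence` is a hypothetical name, and this is only a sketch of the idea, not the cleanup pr-agent itself performs.

```python
FENCE = "`" * 3  # three backticks, built programmatically to keep this example fence-safe


def strip_code_fence(text):
    """Return text without a surrounding Markdown code fence, if one is present."""
    lines = text.strip().split("\n")
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]   # drop the opening fence and its language tag
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]  # drop the closing fence
    return "\n".join(lines)


# A shortened stand-in for the fenced scoring response in the log.
fenced = FENCE + "yaml\ncode_suggestions:\n- suggestion_score: 10\n" + FENCE
print(strip_code_fence(fenced))
```

Input without a fence passes through unchanged, so the same helper can be applied to every response regardless of which model produced it.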