Separate data processing and presentation parts for better code clarity

woung717 commented 3 months ago

Hey pr-agent team,

I'm a regular user of pr-agent for our team's code reviews and often tweak comment items and locales during deployment. However, I've faced some challenges with the markdown/HTML compilation process since it's closely integrated with the Python code, especially for reviews and descriptions tasks output.

I suggest we separate the data processing and presentation parts. We could move the compilation tasks into template files, specifically using Jinja2, as it's already part of our project.

Would love to hear your thoughts on this. If it sounds good, I'm ready to work on these changes and submit a PR.

Cheers

mrT23 commented 3 months ago

I don't understand what you mean by "markdown/HTML compilation".

present it in a clearer way. give examples for before and after, and refer to specific code lines, with links

woung717 commented 3 months ago

Hey, here is the "markdown/HTML compilation" part. https://github.com/Codium-ai/pr-agent/blob/2b77d07725b3e20c0ae9831837df0acad4b72b6b/pr_agent/tools/pr_reviewer.py#L236-L261 and convert_to_markdown_v2 funciton which called inside the above code. https://github.com/Codium-ai/pr-agent/blob/2b77d07725b3e20c0ae9831837df0acad4b72b6b/pr_agent/algo/utils.py#L87-L264

I think the manual combination in the code used to create markdown comment output makes it hard to debug and extend custom output items. Since Jinja2 template supports conditional and looping grammars, it can be converted to Jinja template files.

Therefore, part of converted template can be like this.

...
<table>
<tr><td>⏱️&nbsp;<strong>Estimated effort to review</strong>: {{ output.review_effort }} </td></tr>
<tr><td>🏅&nbsp;<strong>Score</strong>: {{ output.review_score }}</td></tr>
...
{% for item in output.custom_items %}
    <tr><td> {{ item.key }}: {{ item.value }}</td></tr>
{% endfor %}
</table>

...

mrT23 commented 3 months ago

The review tool markdown creation has many logics, ifs, stages, pre-processings, and so on. Hence I think it is less suitable for jinja2

jinja2 flow is harder to debug (no breakpoints), and is less manageable.

jinja2 is more suited for prompts creation. But it is less convenient for a flow rich with conditions and calculations. We need actual code for that.

Tal

woung717 commented 3 months ago

Yes, I noticed many logic-intensive sections of code within the markdown generation process. But, I think that the preprocessing and non-presentation components are currently well-separated from the presentation (markdown) elements. And they should divided more effectively when integrated with Jinja2 templates.

As for debugging, when a template is applied, the code generating output text elements (such as table labels and corresponding values) should reside in the actual Python code. This allows the template to have minimal dynamic code and primarily consist of markdown and HTML skeletons, aiding in understanding the overall output structure. This approach reduces the need for debugging within the template file and, I believe, makes the code easier to maintain over time. (Other template engines, such as Mako, offer more powerful features and enhanced debugging capabilities, but using multiple libraries for the same purpose may not be a good approach).

Ultimately, the decision depends on the main contributors and the community. Please let me know if this change is unnecessary, or if we can implement it in some adjusted manner.

HyeonUng

mrT23 commented 3 months ago

I have been working with jinja2 extensively in the last ~1.5 years. I assure you, it is dramatically harder to debug and maintain, compared to code. The main advantage I saw of it is that the toml files are more accessible to non-technical people, who can still engage with it and try different prompts. But the disadvantages are there.

If you contribute a new code\tool, for example, you are welcome to try and do that in a more 'jinja-like' manner, if you think its more convenient to you.

Existing code will remain in its current form, until we can show improvements from new content contributed in the new form.

woung717 commented 3 months ago

I got your point. I'll make a PR if there's a helpful tool with templates later. Have a great day!

Codium-ai / pr-agent

Separate data processing and presentation parts for better code clarity #1088