Investigate why we use Jinja2 for messages

bhirsz commented 2 weeks ago

For templated rule messages we are using Jinja2. It's not clear for me why we are using it since it is possible to just use string.format (as we are currently doing whenever we render message):

"Line is too long { line_len} / { max_len}".format(**kwargs)

Jinja replaced %s formatting in https://github.com/MarketSquare/robotframework-robocop/issues/534. Maybe @mnojek knows it?

We can additionaly check if it has huge impact on performance: create 50 short Templates, render them 10 times each and compare to string format. If the difference is minimal we could decide to leave it as it is.

bhirsz commented 2 weeks ago

Performance benchmark:

import random
import string
from timeit import default_timer as timer

import jinja2

def generate_messages(count: int) -> tuple[list[str], list[str]]:
    """Generate messages. This part is not timed."""
    base_message_format = "Line is too long {current_length}/{max_length}"
    base_message_template = "Line is too long {{ current_length }}/{{ max_length }}"
    messages_format, messages_template = [], []
    for i in range(count):
        random_suffix = " " + "".join(random.choice(string.ascii_letters) for _ in range(5))
        messages_format.append(base_message_format + random_suffix)
        messages_template.append(base_message_template + random_suffix)
    return messages_format, messages_template

def produce_templated(messages: list[str], multiply: int) -> None:
    start_time = timer()
    max_length = 120
    for message in messages:
        template = jinja2.Template(message)
        for _ in range(multiply):
            current_length = random.randint(1, 200)
            template.render(current_length=current_length, max_length=max_length)
    time_taken = timer() - start_time
    print(f"Templated messages produced in: {time_taken:.3f}s")

def produce_formatted(messages: list[str], multiply: int) -> None:
    start_time = timer()
    max_length = 120
    for message in messages:
        for _ in range(multiply):
            current_length = random.randint(1, 200)
            message.format(current_length=current_length, max_length=max_length)
    time_taken = timer() - start_time
    print(f"Formatted messages produced in: {time_taken:.3f}s")

if __name__ == "__main__":
    messages_format, messages_template = generate_messages(100)
    produce_templated(messages_template, 100)
    produce_formatted(messages_format, 100)

bhirsz commented 2 weeks ago

There is difference, buit quite minimal:

Templated messages produced in: 0.077s
Formatted messages produced in: 0.006s

Around 70ms difference for 100 rules, 100 messages each (10 000 total!). We can have it in mind, but it may not be worth changing. This may be worthwhile for LSP though - if, by average, we're losing 10-30ms it can add up. Just rendering 100 rules with template takes around 20ms.

mnojek commented 2 days ago

We should definitely drop jinja template, as it is not needed at all, and also it's slower. I would not like at it as 70ms difference, but more like a 1283% difference, which is quite impressive.

bhirsz / robotframework-cop-academy

Investigate why we use Jinja2 for messages #32