django-ftl / fluent-compiler

High performance Python implementation of Fluent, Mozilla's l10n language
Other
21 stars 4 forks source link

Optimisation: Avoid set copies when compiling files #31

Closed leamingrad closed 4 months ago

leamingrad commented 4 months ago

This PR aims to speed up file compilation by avoid unnecessary copying of sets.

Previously, the Scope class had methods of the form:

def names_in_use(self):
    names = self.names
    if self.parent_scope is not None:
        # This create a new set
        result = result | self.parent_scope.names_in_use()

    return names

These sets were used by various other codegen functions to check for membership. With large fluent files, this results in a lot of unnecessary set creation, as there might be many names added to a given scope.

This PR replaces these methods with query equivalents of the form:

def is_name_in_use(self, name: str) -> bool:
    if name in self.names:
        return True

    if self.parent_scope is None:
        return False

    return self.parent_scope.is_name_in_use(name)

By doing this we speed up compiling by about 60%. For the 10K line benchmark added in #30, it reduces the time from ~10s to ~4s.

Benchmark before ``` ❯ ./compiler.py -k 10k --benchmark-warmup=off ================================================================================================= test session starts ================================================================================================== platform darwin -- Python 3.10.14, pytest-8.2.1, pluggy-1.5.0 benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000) rootdir: /Volumes/Code/fluent-compiler configfile: pyproject.toml plugins: anyio-4.3.0, hypothesis-6.102.6, benchmark-4.0.0 collected 4 items / 3 deselected / 1 selected compiler.py . [100%] ----------------------------------------------- benchmark: 1 tests ----------------------------------------------- Name (time in s) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations ------------------------------------------------------------------------------------------------------------------ test_file_with_10k_items 9.7863 10.1248 9.9234 0.1377 9.8794 0.2118 1;0 0.1008 5 1 ------------------------------------------------------------------------------------------------------------------ Legend: Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile. OPS: Operations Per Second, computed as 1 / Mean ====================================================================================== 1 passed, 3 deselected in 69.96s (0:01:09) ====================================================================================== ```
Benchmark after ``` ❯ ./compiler.py -k 10k --benchmark-warmup=off ================================================================================================= test session starts ================================================================================================== platform darwin -- Python 3.10.14, pytest-8.2.1, pluggy-1.5.0 benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000) rootdir: /Volumes/Code/fluent-compiler configfile: pyproject.toml plugins: anyio-4.3.0, hypothesis-6.102.6, benchmark-4.0.0 collected 4 items / 3 deselected / 1 selected compiler.py . [100%] ----------------------------------------------- benchmark: 1 tests ---------------------------------------------- Name (time in s) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations ----------------------------------------------------------------------------------------------------------------- test_file_with_10k_items 3.9957 4.0290 4.0099 0.0127 4.0051 0.0164 2;0 0.2494 5 1 ----------------------------------------------------------------------------------------------------------------- Legend: Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile. OPS: Operations Per Second, computed as 1 / Mean =========================================================================================== 1 passed, 3 deselected in 29.47s ========================================================================================== ```
spookylukey commented 4 months ago

This looks great, thanks!

On my Linux box I'm seeing even bigger speedups, from ~28s to ~6s on the same benchmark.