cdgriffith / puremagic

Pure python implementation of identifying files based off their magic numbers
MIT License

Small performance improvements #73

Closed by cclauss 2 months ago

cclauss commented 3 months ago

Use `ruff check --select=C4,PERF,UP` to find and fix minor performance issues. https://docs.astral.sh/ruff

% ruff check --select=C4,PERF,UP --statistics | sort -k2

1   C408    [*] Unnecessary `list` call (rewrite as a literal)
1   PERF401 [ ] Use a list comprehension to create a transformed list
6   UP009   [*] UTF-8 encoding declaration is unnecessary
2   UP015   [*] Unnecessary open mode parameters
1   UP024   [*] Replace aliased errors with `OSError`
5   UP030   [*] Use implicit references for positional format fields
7   UP032   [*] Use f-string instead of `format` call
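
For readers unfamiliar with these rule codes, here is a rough sketch of the kind of rewrite each one asks for. The snippets are hypothetical before/after pairs written for illustration; none of them are taken from the puremagic source, and the variable names are assumptions.

```python
import os
import tempfile

# Hypothetical values used only for this illustration.
name, count = "magic", 3

# C408: a call to list() versus the equivalent literal
old_items = list()
new_items = []

# UP030: explicit positional indexes versus implicit references
old_indexed = "{0} has {1} entries".format(name, count)
new_indexed = "{} has {} entries".format(name, count)

# UP032: a str.format call versus an f-string
new_fstring = f"{name} has {count} entries"

# UP024: in Python 3 these names are aliases of OSError
aliases_match = IOError is OSError and EnvironmentError is OSError

# UP015: "r" is already the default mode for open()
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w") as handle:
        handle.write("hello")
    with open(path, "r") as handle:  # redundant mode argument
        old_text = handle.read()
    with open(path) as handle:  # same behavior without it
        new_text = handle.read()

# UP009 concerns the "# -*- coding: utf-8 -*-" header comment, which
# Python 3 does not need because UTF-8 is the default source encoding.
```

In each pair the two forms behave the same; the rewrite is about brevity and dropping Python 2 era idioms.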

% ruff rule PERF401

manual-list-comprehension (PERF401)

Derived from the Perflint linter.

What it does

Checks for `for` loops that can be replaced by a list comprehension.

Why is this bad?

When creating a transformed list from an existing list using a for-loop, prefer a list comprehension. List comprehensions are more readable and more performant.

In the example below, the list comprehension is ~10% faster on Python 3.11, and ~25% faster on Python 3.10.

Note that, as with all perflint rules, this is only intended as a micro-optimization, and will have a negligible impact on performance in most cases.
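
One way to check the speed difference on your own interpreter is a small `timeit` comparison. This is a minimal sketch, not part of the ruff documentation: the function names `with_loop` and `with_comprehension` are made up for the illustration, and the timings will vary by machine and Python version.

```python
import timeit

original = list(range(10000))


def with_loop():
    # Builds the filtered list with an explicit for loop and append().
    filtered = []
    for i in original:
        if i % 2:
            filtered.append(i)
    return filtered


def with_comprehension():
    # Builds the same list with a list comprehension.
    return [x for x in original if x % 2]


# Both forms must produce identical results before timing them.
assert with_loop() == with_comprehension()

# Take the best of several repeats to reduce noise from other processes.
loop_time = min(timeit.repeat(with_loop, number=200, repeat=5))
comp_time = min(timeit.repeat(with_comprehension, number=200, repeat=5))

print(f"for loop:      {loop_time:.4f} s")
print(f"comprehension: {comp_time:.4f} s")
print(f"ratio:         {loop_time / comp_time:.2f}x")
```

Taking the minimum of the repeats is a common choice for micro-benchmarks, since slower runs usually reflect interference rather than the code itself.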

Example

original = list(range(10000))
filtered = []
for i in original:
    if i % 2:
        filtered.append(i)

Use instead:

original = list(range(10000))
filtered = [x for x in original if x % 2]

If you're appending to an existing list, use the extend method instead:

original = list(range(10000))
filtered = []  # placeholder for a list created earlier
filtered.extend(x for x in original if x % 2)

cdgriffith commented 3 months ago

Thank you! A lot of this code was written while Python 2 was still around, so the upgrade to f-strings is much appreciated.