pappasam / jedi-language-server

A Python language server exclusively for Jedi. If Jedi supports it well, this language server should too.
MIT License

Slowness in get_opcodes when editing large files #322

Open correctmost opened 3 months ago

correctmost commented 3 months ago

Description

I have a ~500KB Python file that contains a tuple with thousands of strings. Editing this file in VS Code triggers high CPU usage in the run-jedi-language-server.py process.

The slowness seems to be caused by the following difflib.SequenceMatcher code in get_opcodes:

https://github.com/pappasam/jedi-language-server/blob/9df0ab5c3481c4e637a8cc1dc3bb5e7afb11c960/jedi_language_server/text_edit_utils.py#L142-L145
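For readers who do not want to open the link, here is a minimal sketch of the kind of call involved. It assumes the linked lines pass the two whole document strings straight to difflib.SequenceMatcher, so every character is a sequence element; char_level_opcodes is a hypothetical stand-in, not the project's function.

```python
import difflib


def char_level_opcodes(old: str, new: str):
    """Diff two strings character by character (hypothetical stand-in)."""
    # Passing str objects makes each character one sequence element, so a
    # ~1 MB file becomes a sequence of about a million items.
    matcher = difflib.SequenceMatcher(a=old, b=new)
    # Each opcode is a (tag, i1, i2, j1, j2) tuple of character offsets.
    return matcher.get_opcodes()


if __name__ == "__main__":
    print(char_level_opcodes("abc", "abcd"))
```

On tiny inputs like this the call returns instantly; the cost only becomes visible at the file sizes described above.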

Steps to reproduce

1 - Save the following script as gen_text.py

import random
import string

for i in range(10000):
    print(''.join(random.choices(string.ascii_lowercase, k=100)))

2 - Run these commands in a terminal

python gen_text.py > before.py
cp before.py after.py
echo "jls_extract_def()" >> after.py

3 - Save the following script as jedi_test.py

from jedi_language_server.text_edit_utils import get_opcodes

with open('before.py') as f:
    before = f.read()

with open('after.py') as f:
    after = f.read()

opcodes = get_opcodes(before, after)
print(opcodes)

4 - Run this command in a terminal

time python jedi_test.py

On my machine, the script takes about 16 seconds to run.
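To check whether the time is spent in difflib itself rather than in interpreter startup or imports, the steps above can be approximated in a single script that does not need jedi-language-server installed. This is only a sketch: make_source and time_char_diff are hypothetical helpers, and it assumes a direct character-level SequenceMatcher call is representative of what get_opcodes does. The line counts are kept small so it finishes quickly; increase them toward 10000 to approach the original reproduction.

```python
import difflib
import random
import string
import time


def make_source(num_lines: int, width: int = 100, seed: int = 0) -> str:
    """Build num_lines lines of random lowercase text, like gen_text.py."""
    rng = random.Random(seed)
    return "".join(
        "".join(rng.choices(string.ascii_lowercase, k=width)) + "\n"
        for _ in range(num_lines)
    )


def time_char_diff(num_lines: int):
    """Time a character-level diff after appending one line to the text."""
    before = make_source(num_lines)
    after = before + "jls_extract_def()\n"
    start = time.perf_counter()
    opcodes = difflib.SequenceMatcher(a=before, b=after).get_opcodes()
    elapsed = time.perf_counter() - start
    return elapsed, opcodes


if __name__ == "__main__":
    # Compare timings across sizes to see how the cost grows with input.
    for n in (250, 500, 1000):
        elapsed, _ = time_char_diff(n)
        print(f"{n} lines: {elapsed:.3f}s")
```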

Related issues

https://github.com/python/cpython/issues/106865 warns about slowness with SequenceMatcher and mentions potential workarounds.
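One possible direction, offered as a sketch and not taken from the linked CPython issue or from this project: diff the files line by line, then translate the line indices back into character offsets so the result keeps the same (tag, i1, i2, j1, j2) shape. The function name line_opcodes_as_char_offsets is hypothetical, and I have not checked whether the rest of text_edit_utils.py could consume these opcodes unchanged.

```python
import difflib
from itertools import accumulate


def line_opcodes_as_char_offsets(old: str, new: str):
    """Diff by line, then report opcodes in character offsets (a sketch)."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    # starts[k] is the character index where line k begins; the final
    # entry equals the total length of the text.
    old_starts = [0, *accumulate(len(line) for line in old_lines)]
    new_starts = [0, *accumulate(len(line) for line in new_lines)]
    # Each whole line is one sequence element, so a 10,000-line file is a
    # sequence of 10,000 items instead of about a million characters.
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return [
        (tag, old_starts[i1], old_starts[i2], new_starts[j1], new_starts[j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


if __name__ == "__main__":
    print(line_opcodes_as_char_offsets("a\nb\n", "a\nb\nc\n"))
```

The trade-off is granularity: a one-character change inside a line is reported as a replacement of the whole line, so the resulting text edits would be coarser than with the character-level diff.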