Open apozharski opened 4 months ago
Hello, while porting sphinx-contrib/matlabdomain I have run into severe performance issues.
An example of a file with extremely poor parsing performance is https://github.com/nurkanovic/nosnoc/blob/0caa4509faa7a979da229a8617ae123b9ae02aa5/src/NosnocModel.m
which takes about 5 minutes to parse on my machine (303 seconds in the profiled run). Below is the cProfile trace sorted by internal time.
cProfile
365740104 function calls (339252023 primitive calls) in 303.003 seconds

Ordered by: internal time

ncalls tottime percall cumtime percall filename:lineno(function)
8136370/49585 36.358 0.000 300.842 0.006 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:545(_parse)
16525617 29.903 0.000 151.679 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:217(search)
17027461 28.075 0.000 28.075 0.000 {built-in method _onigurumacffi.onigcffi_search}
17027461 27.770 0.000 116.194 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/onigurumacffi.py:129(search)
7985230 23.077 0.000 119.386 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:273(_parse)
16525617 22.555 0.000 175.059 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:154(match_and_capture)
17257055/6802 18.700 0.000 302.784 0.045 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/utils/logger.py:17(wrapper)
17027461 15.719 0.000 22.917 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/onigurumacffi.py:83(_start_params)
17027461 12.176 0.000 23.123 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/onigurumacffi.py:87(_region)
17027461 10.600 0.000 12.263 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/onigurumacffi.py:91(_match_ret)
1131845/3192 10.535 0.000 302.690 0.095 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:363(_parse)
16447937 9.431 0.000 14.193 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/utils/logger.py:125(debug)
17027461 6.375 0.000 6.375 0.000 {built-in method _onigurumacffi.onig_region_new}
42394241 5.363 0.000 5.363 0.000 {built-in method builtins.len}
17284783 5.028 0.000 5.028 0.000 /usr/lib/python3.10/logging/__init__.py:1710(getEffectiveLevel)
34054922 4.900 0.000 4.900 0.000 {method 'encode' of 'str' objects}
17027461 4.572 0.000 4.572 0.000 {method 'gc' of '_cffi_backend.FFI' objects}
4283624 4.247 0.000 6.784 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:181(read)
9230629 3.900 0.000 5.175 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:73(disabled)
26515200 3.817 0.000 3.817 0.000 {method 'get' of 'dict' objects}
1131845 2.746 0.000 6.959 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:380(<listcomp>)
2781976 2.730 0.000 3.895 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/onigurumacffi.py:66(start)
4473120 2.237 0.000 2.702 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:68(_check_pos)
289434 1.953 0.000 2.353 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:106(range)
289434 1.651 0.000 7.744 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:137(<dictcomp>)
4336744 1.029 0.000 1.029 0.000 {method 'decode' of 'bytes' objects}
2326726 1.009 0.000 1.009 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/onigurumacffi.py:48(__init__)
1065644 0.799 0.000 1.119 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/onigurumacffi.py:69(end)
2326726 0.653 0.000 0.653 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/onigurumacffi.py:36(_check)
89998 0.533 0.000 1.496 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:589(<listcomp>)
289434 0.486 0.000 10.582 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:128(chars)
657139 0.451 0.000 0.658 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/utils/logger.py:131(info)
489123 0.409 0.000 0.504 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/onigurumacffi.py:61(group)
4028091 0.402 0.000 0.402 0.000 {method 'append' of 'list' objects}
289434 0.344 0.000 0.344 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/elements.py:148(__init__)
2014281 0.342 0.000 0.342 0.000 {method 'isspace' of 'str' objects}
270455 0.320 0.000 0.320 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/elements.py:26(__init__)
61729 0.285 0.000 0.285 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:823(<listcomp>)
94748 0.251 0.000 0.416 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:139(read_pos)
75983 0.233 0.000 0.339 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/elements.py:448(__init__)
391225 0.207 0.000 0.207 0.000 {built-in method builtins.repr}
179705 0.138 0.000 0.197 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/utils/logger.py:137(warning)
126893 0.099 0.000 0.099 0.000 {method 'index' of 'list' objects}
126893 0.091 0.000 0.190 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:659(<lambda>)
12919 0.090 0.000 0.279 0.000 {built-in method builtins.sorted}
272694 0.079 0.000 0.079 0.000 {method 'extend' of 'list' objects}
6663 0.054 0.000 2.558 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/elements.py:72(dispatch)
24664/18001 0.046 0.000 2.622 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/elements.py:122(_dispatch_list)
32101 0.037 0.000 0.040 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:72(next)
32920 0.031 0.000 0.107 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/onigurumacffi.py:72(span)
39214 0.030 0.000 0.030 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:89(prev)
13457/1 0.023 0.000 2.661 2.661 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/elements.py:192(_dispatch)
3610 0.020 0.000 0.080 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:215(_parse)
2272/1 0.015 0.000 2.661 2.661 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/elements.py:502(_dispatch)
16186 0.013 0.000 0.013 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:170(read_line)
9676 0.011 0.000 0.011 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:77(__repr__)
13456 0.009 0.000 0.011 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/elements.py:209(__eq__)
41079 0.008 0.000 0.008 0.000 {built-in method builtins.isinstance}
7554 0.006 0.000 0.011 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/onigurumacffi.py:111(number_of_captures)
1 0.006 0.006 0.007 0.007 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/api.py:33(from_bytes)
7554 0.005 0.000 0.005 0.000 {built-in method _onigurumacffi.onig_number_of_captures}
7435 0.004 0.000 0.007 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/elements.py:59(__eq__)
13603 0.003 0.000 0.003 0.000 {method 'pop' of 'dict' objects}
1778 0.002 0.000 0.002 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:256(__repr__)
1462 0.002 0.000 0.002 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:521(__repr__)
6664 0.001 0.000 0.001 0.000 {method 'items' of 'dict' objects}
1 0.001 0.001 0.001 0.001 {method 'findall' of 're.Pattern' objects}
363 0.000 0.000 0.000 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:481(<genexpr>)
1 0.000 0.000 0.000 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:53(<listcomp>)
1 0.000 0.000 0.000 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:52(<listcomp>)
11 0.000 0.000 0.000 0.000 {method 'split' of 'str' objects}
28 0.000 0.000 0.000 0.000 {built-in method posix.lstat}
182 0.000 0.000 0.000 0.000 {built-in method builtins.next}
4 0.000 0.000 0.000 0.000 /usr/lib/python3.10/posixpath.py:401(_joinrealpath)
28 0.000 0.000 0.000 0.000 /usr/lib/python3.10/posixpath.py:71(join)
3 0.000 0.000 0.000 0.000 {method 'replace' of 'str' objects}
5 0.000 0.000 0.000 0.000 /usr/lib/python3.10/pathlib.py:56(parse_parts)
2 0.000 0.000 0.000 0.000 {built-in method builtins.max}
6 0.000 0.000 0.000 0.000 {built-in method posix.stat}
1 0.000 0.000 303.003 303.003 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parsers/base.py:115(parse_file)
4 0.000 0.000 0.000 0.000 /usr/lib/python3.10/posixpath.py:338(normpath)
1 0.000 0.000 0.000 0.000 {built-in method io.open}
13 0.000 0.000 0.000 0.000 /usr/lib/python3.10/pathlib.py:621(__str__)
5 0.000 0.000 0.000 0.000 /usr/lib/python3.10/pathlib.py:569(_parse_args)
1 0.000 0.000 0.000 0.000 {method 'read' of '_io.BufferedReader' objects}
5 0.000 0.000 0.000 0.000 /usr/lib/python3.10/pathlib.py:589(_from_parts)
4 0.000 0.000 0.001 0.000 /usr/lib/python3.10/pathlib.py:1064(resolve)
7 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/utils.py:368(cut_sequence_chunks)
35 0.000 0.000 0.000 0.000 {built-in method sys.intern}
1 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/utils.py:268(identify_sig_or_bom)
8 0.000 0.000 0.000 0.000 /usr/lib/python3.10/posixpath.py:60(isabs)
36 0.000 0.000 0.000 0.000 /usr/lib/python3.10/posixpath.py:41(_get_sep)
4 0.000 0.000 0.001 0.000 /usr/lib/python3.10/posixpath.py:392(realpath)
44 0.000 0.000 0.000 0.000 {method 'startswith' of 'str' objects}
28 0.000 0.000 0.000 0.000 {method 'partition' of 'str' objects}
5 0.000 0.000 0.000 0.000 /usr/lib/python3.10/pathlib.py:239(splitroot)
1 0.000 0.000 0.000 0.000 {method '__exit__' of '_io._IOBase' objects}
53 0.000 0.000 0.000 0.000 {built-in method posix.fspath}
1 0.000 0.000 0.008 0.008 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:56(from_path)
3 0.000 0.000 0.000 0.000 /usr/lib/python3.10/logging/__init__.py:1724(isEnabledFor)
1 0.000 0.000 300.334 300.334 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parsers/base.py:173(_parse)
5 0.000 0.000 0.000 0.000 /usr/lib/python3.10/pathlib.py:608(_format_parsed_parts)
11 0.000 0.000 0.000 0.000 /usr/lib/python3.10/pathlib.py:631(__fspath__)
1 0.000 0.000 0.001 0.001 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/utils.py:215(any_specified_encoding)
1 0.000 0.000 0.007 0.007 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/api.py:532(from_path)
4 0.000 0.000 0.000 0.000 /usr/lib/python3.10/posixpath.py:377(abspath)
1 0.000 0.000 0.001 0.001 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/handler.py:32(__init__)
1 0.000 0.000 0.000 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/utils/cache.py:79(save)
28 0.000 0.000 0.000 0.000 {built-in method _stat.S_ISLNK}
1 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/utils.py:290(iana_name)
2 0.000 0.000 0.000 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/utils/cache.py:14(_path_to_key)
1 0.000 0.000 0.000 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/utils/logger.py:75(configure)
11 0.000 0.000 0.000 0.000 {method 'startswith' of 'bytes' objects}
28 0.000 0.000 0.000 0.000 {method 'endswith' of 'str' objects}
6 0.000 0.000 0.000 0.000 /usr/lib/python3.10/pathlib.py:1092(stat)
1 0.000 0.000 0.007 0.007 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/api.py:502(from_fp)
1 0.000 0.000 302.994 302.994 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parsers/base.py:161(_parse_language)
2 0.000 0.000 0.000 0.000 /usr/lib/python3.10/logging/__init__.py:219(_acquireLock)
1 0.000 0.000 300.334 300.334 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parser.py:129(parse)
1 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/cd.py:291(merge_coherence_ratios)
1 0.000 0.000 0.000 0.000 /usr/lib/python3.10/pathlib.py:957(__new__)
1 0.000 0.000 0.000 0.000 /usr/lib/python3.10/re.py:288(_compile)
9 0.000 0.000 0.000 0.000 {method 'join' of 'str' objects}
1 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/models.py:257(append)
5 0.000 0.000 0.000 0.000 {built-in method __new__ of type object at 0x5acc750939a0}
1 0.000 0.000 0.001 0.001 /usr/lib/python3.10/re.py:232(findall)
1 0.000 0.000 0.000 0.000 /usr/lib/python3.10/pathlib.py:718(suffix)
1 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/models.py:11(__init__)
1 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/models.py:237(__getitem__)
2 0.000 0.000 0.000 0.000 /usr/lib/python3.10/logging/__init__.py:1532(log)
5 0.000 0.000 0.000 0.000 {method 'lstrip' of 'str' objects}
2 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/models.py:231(__init__)
1 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/models.py:197(could_be_from_charset)
2 0.000 0.000 0.000 0.000 /usr/lib/python3.10/logging/__init__.py:228(_releaseLock)
1 0.000 0.000 0.000 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/utils/cache.py:58(cache_valid)
2 0.000 0.000 0.000 0.000 {method 'acquire' of '_thread.RLock' objects}
1 0.000 0.000 0.000 0.000 {method 'format' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.000 0.000 0.000 0.000 {built-in method builtins.sum}
1 0.000 0.000 0.000 0.000 /usr/lib/python3.10/pathlib.py:710(name)
1 0.000 0.000 0.000 0.000 {built-in method builtins.round}
5 0.000 0.000 0.000 0.000 {method 'reverse' of 'list' objects}
1 0.000 0.000 0.000 0.000 /usr/lib/python3.10/pathlib.py:1285(exists)
1 0.000 0.000 0.000 0.000 /home/anton/tools/textmate-grammar-python/src/textmate_grammar/parsers/matlab/__init__.py:51(pre_process)
2 0.000 0.000 0.000 0.000 /usr/lib/python3.10/logging/__init__.py:1307(disable)
1 0.000 0.000 0.000 0.000 {built-in method builtins.min}
1 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/models.py:71(__str__)
1 0.000 0.000 0.000 0.000 /usr/lib/python3.10/logging/__init__.py:1455(debug)
1 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/models.py:277(best)
2 0.000 0.000 0.000 0.000 {method 'release' of '_thread.RLock' objects}
1 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/models.py:204(<listcomp>)
1 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/cd.py:305(<listcomp>)
1 0.000 0.000 0.000 0.000 {method 'lower' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'rfind' of 'str' objects}
1 0.000 0.000 0.000 0.000 /home/anton/tools/matlabdomain/venv/lib/python3.10/site-packages/charset_normalizer/models.py:170(raw)
1 0.000 0.000 0.000 0.000 {method 'add' of 'set' objects}
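
For anyone who wants to reproduce a trace in this format, here is a minimal sketch using only the standard-library `cProfile` and `pstats` modules. The `work` function is a hypothetical stand-in for the actual parse call, and `profile_by_internal_time` is a helper name made up for this example; neither is part of textmate-grammar-python.

```python
import cProfile
import io
import pstats


def work(n: int = 20000) -> int:
    """Hypothetical stand-in for the slow call being profiled."""
    total = 0
    for i in range(n):
        total += len(str(i))
    return total


def profile_by_internal_time(func, *args, top: int = 20) -> str:
    """Run func under cProfile and return a report sorted by internal time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args)
    finally:
        profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "tottime" is the time spent inside a function itself, excluding callees;
    # pstats labels this ordering "internal time" in the report header.
    stats.sort_stats("tottime").print_stats(top)
    return buffer.getvalue()


report = profile_by_internal_time(work)
print(report)
```

Replacing `work` with the real parse call and raising `top` should give a listing comparable to the one above, with the same `ncalls tottime percall cumtime percall` columns.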