manusimidt / py-xbrl

Python-based parser for parsing XBRL and iXBRL files
https://py-xbrl.readthedocs.io/en/latest/
GNU General Public License v3.0
111 stars 40 forks

[speedup] namespace->taxonomy look up recursion replaced with fast LUT #70

Open mrx23dot opened 3 years ago

mrx23dot commented 3 years ago

Replaced the recursive namespace-to-taxonomy lookup with a fast lookup table (LUT).

before

2 758 449 function calls (2647063 primitive calls) in 1.893 seconds

ncalls        tottime  percall  cumtime  percall  filename:lineno(function)
419944        0.802    0.000    0.802    0.000    {method 'findall' of 're.Pattern' objects}  --> bottleneck
209962        0.187    0.000    1.364    0.000    uri_helper.py:58(compare_uri)  --> bottleneck
420027        0.130    0.000    0.209    0.000    re.py:271(_compile)
419924        0.124    0.000    1.109    0.000    re.py:215(findall)
105789/2126   0.119    0.000    1.483    0.001    taxonomy.py:170(get_taxonomy)

after

587 655 function calls (493573 primitive calls) in 0.496 seconds

ncalls        tottime  percall  cumtime  percall  filename:lineno(function)
21            0.110    0.005    0.110    0.005    {method '_parse_whole' of 'xml.etree.ElementTree.XMLParser' objects}  --> bottleneck
17486         0.044    0.000    0.044    0.000    {method 'sub' of 're.Pattern' objects}
15/1          0.040    0.003    0.295    0.295    taxonomy.py:223(parse_taxonomy)
86870/511     0.036    0.000    0.043    0.000    taxonomy.py:170(get_taxonomy_LUT)
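For anyone who wants to reproduce listings in this format: they look like standard-library `cProfile` output sorted by `tottime`. A minimal sketch, assuming a hypothetical `workload` function in place of the actual parsing call (which is not shown here):

```python
import cProfile
import io
import pstats


def workload():
    # Hypothetical stand-in for the code being measured, e.g. parsing a filing.
    return sum(i * i for i in range(10_000))


def profile_by_tottime(func, limit=5):
    """Run func under cProfile and return the top `limit` rows by tottime."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    # Write the report into a string instead of stdout.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("tottime").print_stats(limit)
    return buffer.getvalue()


report = profile_by_tottime(workload)
print(report)
```

Sorting by `tottime` surfaces the functions that spend the most time in their own body, which is how the regex `findall` calls stand out in the "before" listing.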

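Takes much less time and stack. In the profiled run, total time drops from 1.893 s to 0.496 s and function calls from 2 758 449 to 587 655. Functionality is unchanged and the tests pass.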
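To illustrate the general idea (not the actual code in this PR), here is a minimal sketch of swapping a recursive namespace search for a dictionary built once. The `Taxonomy` class, `normalize`, `find_recursive`, and `build_lut` are all hypothetical names, and the normalization rule is an assumption standing in for whatever URI comparison the library performs.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Taxonomy:
    """Hypothetical stand-in for a parsed taxonomy schema (not py-xbrl's class)."""
    namespace: str
    imports: list[Taxonomy] = field(default_factory=list)


def normalize(namespace: str) -> str:
    # Assumed normalization so that equivalent URIs map to the same key.
    return namespace.strip().rstrip("/").lower()


def find_recursive(root: Taxonomy, namespace: str) -> Taxonomy | None:
    """Slow path: walk the import tree and compare URIs on every lookup."""
    if normalize(root.namespace) == normalize(namespace):
        return root
    for imported in root.imports:
        found = find_recursive(imported, namespace)
        if found is not None:
            return found
    return None


def build_lut(root: Taxonomy) -> dict[str, Taxonomy]:
    """Fast path: walk the import tree once and index it by normalized namespace."""
    lut: dict[str, Taxonomy] = {}
    seen: set[int] = set()
    stack = [root]
    while stack:
        tax = stack.pop()
        if id(tax) in seen:
            continue  # already indexed; also guards against import cycles
        seen.add(id(tax))
        # setdefault keeps the first taxonomy found for a namespace,
        # matching the depth-first order of find_recursive.
        lut.setdefault(normalize(tax.namespace), tax)
        # Reverse so the first import is popped (visited) first.
        stack.extend(reversed(tax.imports))
    return lut
```

After `lut = build_lut(root)`, each lookup is a single `lut.get(normalize(namespace))` instead of a tree walk with a URI comparison at every node, which is the kind of saving the two profiles show.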