vallettea / koala

Transpose your Excel calculations into Python for better performance and scaling.
GNU General Public License v3.0

Small refactor of `Spreadsheet.from_dict` to make it faster. #213

Closed. doconix closed this 5 years ago

doconix commented 5 years ago

Profiling `from_dict` to see why it was slow made it obvious that the inner function `find_cell` was the culprit. This refactor removes the inner function and uses a temporary dictionary for the lookups instead, which makes loading much faster.
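The diff itself is not shown in this thread, so the following is only a minimal sketch of the general idea (replacing a repeated linear search with a dictionary built once), not the actual koala code. The `Cell` dataclass, `find_cell_linear`, and `build_address_index` are hypothetical names for illustration, and keying the dictionary by cell address is an assumption based on the `address` calls in the profile.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical stand-in for koala's Cell; only the address matters here."""
    address: str
    value: float = 0.0


def find_cell_linear(cells, address):
    """Scan every cell until the address matches: O(n) per lookup."""
    for cell in cells:
        if cell.address == address:
            return cell
    return None


def build_address_index(cells):
    """Build a temporary dict once so each later lookup is O(1) on average."""
    return {cell.address: cell for cell in cells}


cells = [Cell(f"Sheet1!A{row}", float(row)) for row in range(1, 1001)]
index = build_address_index(cells)

target = "Sheet1!A750"
# Both approaches return the same object; only the cost per lookup differs.
assert find_cell_linear(cells, target) is index.get(target)
print(index[target].value)  # 750.0
```

With thousands of lookups against thousands of cells, the scan version performs millions of address comparisons, which would be consistent with the `address` call count in the "before" profile; the dictionary version pays the build cost once.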

The two profile outputs below show the total time for five `from_dict` loads going from 2.42 to 0.23 seconds.


Code used to test (this uses a fairly large, complex spreadsheet):
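
    import cProfile
    import pstats

    from koala import Spreadsheet

    file_name = '/Users/conan/Downloads/logicandreference-a9-Conan_Albrecht-1.xlsm'
    sp1 = Spreadsheet(file_name)
    data = sp1.asdict()  # serialize once, outside the profiled region

    # Profile only the from_dict calls (five loads in total).
    pr = cProfile.Profile()
    pr.enable()
    for i in range(5):
        sp2 = Spreadsheet.from_dict(data)
    pr.disable()

    # Print the 50 most expensive functions, sorted by internal time.
    ps = pstats.Stats(pr).sort_stats('time')
    ps.print_stats(50)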

Before the change:
         7158776 function calls (7158276 primitive calls) in 2.418 seconds

   Ordered by: internal time
   List reduced from 83 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     5520    1.408    0.000    2.202    0.000 /Users/conan/Documents/data/programming/koala/koala/Spreadsheet.py:1073(find_cell)
  6889145    0.797    0.000    0.797    0.000 /Users/conan/Documents/data/programming/koala/koala/Cell.py:172(address)
     2135    0.050    0.000    0.050    0.000 {built-in method builtins.compile}
    10975    0.047    0.000    0.067    0.000 /Users/conan/Documents/data/programming/koala/koala/Cell.py:21(__init__)

After the change:
         297041 function calls (296541 primitive calls) in 0.230 seconds

   Ordered by: internal time
   List reduced from 83 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10975    0.063    0.000    0.083    0.000 /Users/conan/Documents/data/programming/koala/koala/Cell.py:21(__init__)
     2135    0.041    0.000    0.041    0.000 {built-in method builtins.compile}
        5    0.024    0.005    0.053    0.011 /Users/conan/.pyenv/versions/me3.6/lib/python3.6/site-packages/networkx/readwrite/json_graph/node_link.py:104(node_link_graph)

danielsjf commented 5 years ago

This looks good to me.

vallettea commented 5 years ago

this is awesome, thanks