Running a profiler shows that the training process spends a lot of time normalizing the date column. It is possible to cache this function's values for the duration of the training process, which leads to a significant speed improvement.
Profiler command :
python3 -m cProfile tests/func/test_ozone.py
sample output :
```
INFO:pyaf.std:END_TRAINING_TIME_IN_SECONDS 'Ozone' 3.797365665435791
...
...
         1762583 function calls (1738911 primitive calls) in 5.604 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <decorator-gen-0>:1(<module>)
        1    0.000    0.000    0.000    0.000 <decorator-gen-10>:1(<module>)
     ....
        1    0.000    0.000    0.000    0.000 Time.py:14(cTimeInfo)
       14    0.000    0.000    0.012    0.001 Time.py:159(cutFrame)
        4    0.000    0.000    0.003    0.001 Time.py:16(__init__)
      216    0.000    0.000    0.000    0.000 Time.py:218(isOneRowDataset)
      216    0.012    0.000    0.013    0.000 Time.py:300(normalizeTime)
       25    0.000    0.000    0.000    0.000 Time.py:306(addMonths)
       25    0.001    0.000    0.002    0.000 Time.py:311(nextTime)
        1    0.000    0.000    0.000    0.000 Time.py:39(info)
        1    0.000    0.000    0.001    0.001 Time.py:51(to_json)
        1    0.000    0.000    0.004    0.004 Time.py:59(addVars)
       25    0.000    0.000    0.000    0.000 Time.py:66(get_time_dtype)
        1    0.000    0.000    0.000    0.000 Time.py:7(<module>)
       25    0.000    0.000    0.003    0.000 Time.py:76(checkDateAndSignalTypesForNewDataset)
```
Fix : add a memoization decorator around this function (see the sketch below).
No test impact is expected (except on training time ;)
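For illustration, here is a minimal sketch of the memoization idea. This is not the actual patch: the class name, attribute names and cache layout below are assumptions, and the real normalizeTime in Time.py may organize its cache differently.

```python
# Sketch: memoize normalizeTime() by caching results in a dict keyed by the
# raw time value, since the normalization only depends on the fixed training
# time range. Names (cTimeInfoSketch, mNormalizeCache) are illustrative.

class cTimeInfoSketch:
    def __init__(self, time_min, time_max):
        self.mTimeMin = time_min
        self.mTimeDelta = time_max - time_min
        self.mNormalizeCache = {}  # raw time value -> normalized value

    def normalizeTime(self, t):
        cached = self.mNormalizeCache.get(t)
        if cached is None:
            # map t into [0, 1] relative to the training time range
            cached = (t - self.mTimeMin) / self.mTimeDelta
            self.mNormalizeCache[t] = cached
        return cached
```

The cache here is per-instance, so it is discarded with the time-info object once training ends; an alternative would be functools.lru_cache on a standalone helper function.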
Fixed.