splor-mg / notas

Knowledge base
https://splor-mg.github.io/notas/main

python profiling #36

Open fjuniorr opened 6 months ago

fjuniorr commented 6 months ago

Write a script, profiling.py, that calls the functions you want to analyze, then run it under cProfile:

python -m cProfile -o results.prof profiling.py
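For illustration, a profiling.py could look like the sketch below. The function `slow_sum` and its inputs are hypothetical stand-ins for whatever code you actually want to measure.

```python
# profiling.py (hypothetical example)

def slow_sum(n):
    """Stand-in for the real code you want to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    # Call the functions under analysis with representative inputs.
    return [slow_sum(n) for n in (1_000, 10_000, 100_000)]


if __name__ == "__main__":
    main()
```

When the script is run with `python -m cProfile`, it executes as `__main__`, so the guard at the bottom still triggers `main()`.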

Afterwards, you can analyze the results with the pstats module:

import pstats
from pstats import SortKey

old = pstats.Stats('results.prof')
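old.sort_stats(SortKey.TIME).print_stats(10)  # top 10 entries by tottime
old.sort_stats(SortKey.TIME).print_stats(5, 'sessions.py')  # take the top 5 first, then keep only those matching 'sessions.py'
old.sort_stats(SortKey.TIME).print_stats('sessions.py', 5)  # filter on 'sessions.py' first, then take the top 5 of those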

Keep an eye out for functions that should be executed once per input but show a higher call count (ncalls) than that.
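The call-count check above can also be done programmatically. This is a minimal sketch, assuming Python 3.9+ (for `Stats.get_stats_profile()`). The functions `parse` and `process` are hypothetical, and `process` calls `parse` twice per item on purpose.

```python
import cProfile
import pstats


def parse(item):
    # Hypothetical per-input function that should run once per item.
    return item * 2


def process(items):
    # Deliberate bug for the demo: parse() is called twice per item.
    return [parse(i) + parse(i) for i in items]


n_inputs = 100
profiler = cProfile.Profile()
profiler.runcall(process, range(n_inputs))

# func_profiles is keyed by bare function name; ncalls is reported as a string.
profile = pstats.Stats(profiler).get_stats_profile()
parse_calls = profile.func_profiles["parse"].ncalls

if parse_calls != str(n_inputs):
    print(f"parse ran {parse_calls} times for {n_inputs} inputs")
```

Because `func_profiles` is keyed by function name alone, two functions with the same name in different files end up under one key. For a real project the printed table may be the safer place to look.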

Notes

The files cProfile and profile can also be invoked as a script to profile another script. For example:

python -m cProfile [-o output_file] [-s sort_order] (-m module | myscript.py)

Results

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    20000    0.713    0.000    0.838    0.000 {built-in method io.open}
    20000    0.157    0.000    0.235    0.000 {method 'read' of '_io.TextIOWrapper' objects}
    20015    0.107    0.000    0.107    0.000 {method '__exit__' of '_io._IOBase' objects}
    20000    0.098    0.000    1.576    0.000 notes.py:4(__init__)
    20003    0.069    0.000    0.099    0.000 pathlib.py:64(parse_parts)
        1    0.054    0.054    1.717    1.717 notebook.py:16(<dictcomp>)
    20000    0.051    0.000    0.051    0.000 {built-in method _codecs.utf_8_decode}
    20002    0.041    0.000    0.145    0.000 pathlib.py:682(_parse_args)
    10001    0.037    0.000    0.081    0.000 pathlib.py:556(_select_from)
    20002    0.034    0.000    0.218    0.000 pathlib.py:1079(__new__)
    20000    0.031    0.000    0.031    0.000 {built-in method _locale.nl_langinfo}
    20000    0.029    0.000    0.059    0.000 pathlib.py:866(stem)
    20002    0.026    0.000    0.178    0.000 pathlib.py:702(_from_parts)
    20000    0.026    0.000    0.077    0.000 codecs.py:319(decode)
    20000    0.025    0.000    0.056    0.000 _bootlocale.py:33(getpreferredencoding)

tottime is the total time spent in the function alone. cumtime is the total time spent in the function plus all functions that this function called.

The two values are the same if a function never calls anything else.

From the documentation:

tottime
for the total time spent in the given function (and excluding time made in calls to sub-functions)

[...]

cumtime
is the cumulative time spent in this and all subfunctions (from invocation till exit). This figure is accurate even for recursive functions.
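The difference between tottime and cumtime can be seen on a small case. This is a sketch, assuming Python 3.9+ (for `Stats.get_stats_profile()`). The functions `leaf` and `wrapper` are hypothetical: `leaf` makes no calls at all, while `wrapper` does little besides calling `leaf`.

```python
import cProfile
import pstats


def leaf(n):
    # Pure-Python loop with no calls at all (not even built-ins),
    # so all of the time it takes is its own.
    total = 0
    i = 0
    while i < n:
        total += i * i
        i += 1
    return total


def wrapper(n):
    # Does almost no work itself; nearly all of its time is spent inside leaf().
    return leaf(n) + leaf(n)


profiler = cProfile.Profile()
profiler.runcall(wrapper, 200_000)

funcs = pstats.Stats(profiler).get_stats_profile().func_profiles
leaf_stats = funcs["leaf"]
wrapper_stats = funcs["wrapper"]

print("leaf:    tottime", leaf_stats.tottime, "cumtime", leaf_stats.cumtime)
print("wrapper: tottime", wrapper_stats.tottime, "cumtime", wrapper_stats.cumtime)
```

For `leaf` the two figures should match. For `wrapper` the tottime should be close to zero while the cumtime covers both `leaf` calls. The exact numbers depend on the machine.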

Exporting results

To save the statistics in raw format, call the dump_stats() method with the path of the output file (directory plus file name). Here the cProfile output is saved in the data folder under the file name cProfileExport.
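The export snippets in this section use a `profiler` variable that the note never defines. Below is a minimal sketch of one way to obtain it, assuming it is a `cProfile.Profile` instance. The function `work` is a hypothetical stand-in, a temporary directory replaces `../data` so the sketch runs anywhere, and `get_stats_profile()` requires Python 3.9+.

```python
import cProfile
import os
import pstats
import tempfile


def work(n):
    # Hypothetical workload; replace with the code you actually want to profile.
    total = 0
    for i in range(n):
        total += i * i
    return total


# Collect profiling data for a single call to work().
profiler = cProfile.Profile()
profiler.runcall(work, 10_000)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "cProfileExport")
    # Write the raw statistics to disk...
    pstats.Stats(profiler).dump_stats(path)
    # ...and load them back, as you would in a later session.
    reloaded = pstats.Stats(path)
    reloaded_functions = set(reloaded.get_stats_profile().func_profiles)

print("work" in reloaded_functions)
```

A file written by dump_stats() has the same format as the one produced by the `-o` option on the command line, so it can be opened with `pstats.Stats('<path>')` in the same way as results.prof.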

import io
import pstats

# `profiler` is assumed to be a cProfile.Profile instance that has already collected data
stats = pstats.Stats(profiler)
stats.dump_stats('../data/cProfileExport')

# export as txt

result = io.StringIO()
stats = pstats.Stats(profiler, stream=result).sort_stats('ncalls')
stats.print_stats()
# Save it to disk
with open('../data/cProfileExport.txt', 'w+') as f:
    f.write(result.getvalue())

# export as csv

result = io.StringIO()
stats = pstats.Stats(profiler, stream=result).sort_stats('ncalls')
stats.print_stats()
result = result.getvalue()
# Chop the string into a csv-like buffer, starting at the header row
result = 'ncalls' + result.split('ncalls')[-1]
# The report has 6 columns, so split at most 5 times; this keeps names such as
# "{built-in method io.open}" in a single field
result = '\n'.join([','.join(line.rstrip().split(None, 5)) for line in result.split('\n')])
# Save it to disk
with open('../data/cProfileExport.csv', 'w+') as f:
    f.write(result)
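Splitting the printed report on whitespace is fragile. As an alternative (not the approach used above), the sketch below builds the CSV from structured data with `Stats.get_stats_profile()` and the csv module. It assumes Python 3.9+. The function `work` and the column names are illustrative choices, and the CSV is kept in memory instead of being written to `../data`.

```python
import cProfile
import csv
import io
import pstats


def work(n):
    # Hypothetical workload, only here to give the profiler something to measure.
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.runcall(work, 10_000)

# Structured view of the statistics: one FunctionProfile per function name.
profile = pstats.Stats(profiler).get_stats_profile()

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["ncalls", "tottime", "percall_tottime", "cumtime",
                 "percall_cumtime", "file_name", "line_number", "function"])
for name, fp in profile.func_profiles.items():
    writer.writerow([fp.ncalls, fp.tottime, fp.percall_tottime, fp.cumtime,
                     fp.percall_cumtime, fp.file_name, fp.line_number, name])

csv_text = buffer.getvalue()
print(csv_text)
```

The csv module quotes fields as needed, so function names containing spaces or commas stay in one column. One caveat: `func_profiles` is keyed by function name only, so same-named functions from different files are merged into a single row.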