salabim / ycecream

Sweeter debugging and benchmarking Python programs.
MIT License

Question/Feature Request: return timing information only #6

Closed PFython closed 3 years ago

PFython commented 3 years ago

I'd like to compare two functions loop1 and loop2, running each of them count times, and compare the average durations.

I've raised a separate issue about capturing the output, e.g. y| returned None from loop1() in 0.414763 seconds. It's easy enough to parse that string to get just the duration 0.414763 as a float or timedelta object, but would it be an easy enough enhancement to add a keyword argument, e.g. x = y(loop(), duration_only=True), for users who just want to record durations and do further analysis (calculate the mean, plot a graph comparing the two loops, etc.)?

Or have I just missed an easy way of achieving this from the README?

import requests
from ycecream import y

@y(show_exit=True, show_enter=False)
def loop1():
    x = page*count
    for i in range(len(x)):
        s = len(x)

@y(show_exit=True, show_enter=False)
def loop2():
    x = page*count
    lx = len(x)
    for i in range(lx):
        s = lx

if __name__ == "__main__":
    page = requests.get('https://www.bbc.co.uk/news').text
    count = 10
    for loop in [loop1, loop2]:
        results = []
        for run in range(10):
            result = y(loop(), as_str=True)
            results.append(result)
        print(results)

OUTPUT

y| returned None from loop1() in 0.414763 seconds
y| returned None from loop1() in 0.418484 seconds
y| returned None from loop1() in 0.408090 seconds
y| returned None from loop1() in 0.406661 seconds
y| returned None from loop1() in 0.418872 seconds
y| returned None from loop1() in 0.413751 seconds
y| returned None from loop1() in 0.418103 seconds
y| returned None from loop1() in 0.406673 seconds
y| returned None from loop1() in 0.406783 seconds
y| returned None from loop1() in 0.412694 seconds
['y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n']        
y| returned None from loop2() in 0.157985 seconds
y| returned None from loop2() in 0.158627 seconds
y| returned None from loop2() in 0.160737 seconds
y| returned None from loop2() in 0.165181 seconds
y| returned None from loop2() in 0.171878 seconds
y| returned None from loop2() in 0.165958 seconds
y| returned None from loop2() in 0.164515 seconds
y| returned None from loop2() in 0.160872 seconds
y| returned None from loop2() in 0.160113 seconds
y| returned None from loop2() in 0.165789 seconds
['y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n', 'y| loop(): None\n']        
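
The exit lines in the output above could be parsed along these lines. This is only a sketch: the regular expression is inferred from the sample output shown here, not from ycecream's documentation, and parse_duration is a hypothetical helper name.

```python
import re
from statistics import mean

# Matches lines such as "y| returned None from loop1() in 0.414763 seconds".
# Assumption: the line format is exactly as in the sample output above.
EXIT_LINE = re.compile(r"from (?P<func>\w+)\(\) in (?P<seconds>\d+(?:\.\d+)?) seconds")


def parse_duration(line):
    """Return (function_name, seconds) for an exit line, or None if it does not match."""
    match = EXIT_LINE.search(line)
    if match is None:
        return None
    return match.group("func"), float(match.group("seconds"))


sample = [
    "y| returned None from loop1() in 0.414763 seconds",
    "y| returned None from loop1() in 0.418484 seconds",
    "y| loop(): None",  # not an exit line, so it is skipped
]
parsed = [p for p in map(parse_duration, sample) if p is not None]
durations = [seconds for _, seconds in parsed]
print(f"mean duration: {mean(durations):.6f} seconds")
```

Lines that carry no timing information, such as the 'y| loop(): None' strings returned with as_str=True, simply yield None and are filtered out.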
salabim commented 3 years ago

I think I have answered this in issue #5.

PFython commented 3 years ago

Many thanks for the quick replies and suggestions Ruud! Very elegant. Using output= catches the results perfectly. In case it's of use, here's my final code based on your suggestions:

from statistics import mean
from ycecream import y

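def collect(s):
    # Receives each output line as a string, e.g.
    # "y| returned None from func1() in 0.414763 seconds"
    if not hasattr(y, "results"):
        y.results = {}
    # Function name sits between "from " and "()"
    func = s.split("from ")[1].split("()")[0]
    # Duration sits between "in " and " seconds"
    seconds = float(s.split("in ")[1].split(" seconds")[0])
    y.results.setdefault(func, []).append(seconds)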

def results(name):
    output = f"\nMean duration for <{name}>:\t"
    output += f"{mean(y.results[name])} seconds"
    output += f"\nMin/Max duration for <{name}>:\t"
    output += f"{min(y.results[name])} to {max(y.results[name])} seconds"
    output += f"\n\nResults for <{name}>:"
    output += f"\n{y.results[name]}"
    return output

benchmark = y.fork(output=collect, show_enter=False)

@benchmark()
def func1():
    for i in range(len(x)):
        s = len(x)

@benchmark()
def func2():
    lx = len(x)
    for i in range(lx):
        s = lx

if __name__ == "__main__":
    x = "x" * 6500000
    for func in [func1, func2]:
        for run in range(10):
            func()
        print(results(func.__name__))

It might be beyond the scope of what you intended for ycecream, but I do think it's a common scenario while debugging: people want to compare two (or more) different functions and get back a range of duration values. If it were possible to internalise something like collect and results into a convenience method in ycecream itself, I think it might appeal to an even wider audience. Your call!
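
To make the idea concrete, here is a rough sketch of what such a convenience helper might do, written with only the standard library rather than ycecream. The names time_calls and compare are hypothetical and are not part of ycecream; the test functions mirror func1 and func2 above with a smaller string so the example runs quickly.

```python
import time
from statistics import mean


def time_calls(func, runs=10):
    """Call func `runs` times and return the list of durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def compare(funcs, runs=10):
    """Return {function name: list of durations} for each function in funcs."""
    return {func.__name__: time_calls(func, runs) for func in funcs}


x = "x" * 100_000


def func1():
    for i in range(len(x)):
        s = len(x)


def func2():
    lx = len(x)
    for i in range(lx):
        s = lx


timings = compare([func1, func2], runs=5)
for name, durations in timings.items():
    print(f"{name}: mean {mean(durations):.6f} s, "
          f"min {min(durations):.6f} s, max {max(durations):.6f} s")
```

Returning the raw list of durations per function, rather than a formatted string, leaves further analysis (mean, variance, plotting) to the caller.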

salabim commented 3 years ago

I don't think it is possible to define a generic function that does what you would like. But if you can specify it, I can have a look at how to implement it. For now, I think this issue can be closed.