JoeyHendricks / python-benchmark-harness

A micro/macro benchmark framework for the Python programming language that helps with optimizing your software.
MIT License

Inconsistent tests: have I caused this problem, or is this expected behavior? #10

Closed afparsons closed 2 years ago

afparsons commented 3 years ago

Hi Joey,

I have implemented my ideas on the observer_pattern branch of my fork. You can examine the diff here, although be warned that it is rather large.

I'm most confident in the changes I made to QuickPotato.profiling.intrusive and QuickPotato.profiling.interpreters. I'm least confident in any of the changes I made to the other test infrastructure.

If I run test_pt_boundary_testing, all tests pass.

If I run test_pt_visualizations, the test fails, although I'm not sure it was ever supposed to succeed in the first place, as FlameGraph() does not receive a test_id.

I'm most confused about test_pt_regression_testing. If I run the tests again and again without making any changes to the code, I get different results. Observe the following screenshots, and note that they were all taken in succession with no changes to the code.

Screenshot from 2021-07-30 11-59-48

Screenshot from 2021-07-30 12-00-47

Screenshot from 2021-07-30 12-02-08

Screenshot from 2021-07-30 12-02-53

I'm confused about why this happens. Is this expected behavior, or have I messed something up?


Edit: In case you are confused about how my PerformanceBreakpoint decorator should be used, see the following example:

# standard library
from time import sleep
from math import log10

# QuickPotato
from QuickPotato.profiling.intrusive import PerformanceBreakpoint
from QuickPotato.profiling.interpreters import SimpleInterpreter, StatisticsInterpreter

@PerformanceBreakpoint
def function_1():
    sleep(1)
    return len(str(log10(7**7**7)))

@PerformanceBreakpoint(observers=[StatisticsInterpreter])
def function_2():
    sleep(1)
    return len(str(log10(7**7**7)))

@PerformanceBreakpoint(observers=[SimpleInterpreter])
def function_3():
    sleep(1)
    return len(str(log10(7**7**7)))

# --- and now in console ---

>>> from QuickPotato import performance_test
>>> from ??? import function_1, function_2, function_3

# required in order to set performance_test.test_id
>>> performance_test.test_case_name = 'MyDemonstrativeTest'

>>> function_3()
# runs the SimpleInterpreter, which just prints to console, and returns the function's value
SimpleInterpreter
 ├─ self.method_name='function_3'
 ├─ self.test_id='GU98BK70CBI9'
 ├─ self.sample_id='BHVO2ZDN'
 ├─ self.database_name='QuickProfiling'
 ├─ subject.profiler.total_response_time=1.1437058448791504
17

>>> function_1()
# simply runs the function without profiling
17

>>> function_2()
# runs the StatisticsInterpreter, which logs to the database, and returns the function's value
17

~I've not tested the execution_wrapper parameter yet, but I think I need to use that with Dagster. When I used JoeyHendricks/QuickPotato:master with Dagster:~

~Based on that experience, I think I will need to respectively pass the following to execution_wrapper:~


Edit 2, re Dagster: nope. I'm going to have to experiment a bit more with decorator voodoo.

JoeyHendricks commented 3 years ago

Hi Andrew,

I wouldn't trust my unit tests. I haven't updated them properly in a while, and fixing them up has been on my radar. As I was the only person having a bit of trouble with those tests, I did not prioritize fixing them, so sorry for that :D.

Well done! I am really liking the observer design pattern. I don't yet understand the idea behind it, but I think it would be a great improvement to the framework! As the changes you have made are quite large in number, we could schedule a call through Discord or something to go through them. Would you be up for that? (My time zone is CEST and I am available only after my office hours, so 17:00 to 23:00, and all day on the weekends of course ;) ) Perhaps you could also elaborate a bit more on what you wish to change further :).


Edit: I have made an official branch to further experiment with adding the observer pattern into QuickPotato :)

afparsons commented 3 years ago

I don't yet understand the idea behind it

Sure; I probably haven't explained my idea yet. I don't have time at this moment to go into detail, but here's an overview.

we could schedule a call

I'll get back to you about setting up a specific time. I have a busy week ahead of me :sweat:


Presently, QuickPotato runs in the following manner:

  1. the decorated function is called
  2. the code's execution is profiled
  3. the profiling results are inserted into the database
  4. the function's output is returned

In all of the workflows you were imagining, step 3 seems perfectly reasonable. But what if a programmer does not want to write the results to the database? Or what if they want to perform multiple actions with the profiled data, including writing to the database (and they aren't too concerned about overhead :slightly_smiling_face: )? For example, they might want to let another function know about that test_id.
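
As a rough illustration of the four steps listed above, here is a minimal sketch of a decorator with the storage step hard-wired in. It is not QuickPotato's actual code: `hardwired_profile`, `FAKE_DATABASE`, and the use of `time.perf_counter` in place of a real profiler are all assumptions made for the example.

```python
import time
from functools import wraps

# Hypothetical stand-in for the results database (an assumption, not QuickPotato's storage).
FAKE_DATABASE = []


def hardwired_profile(func):
    """Time a call and always store the measurement; the caller cannot opt out."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)                   # steps 1 and 2: run the function while timing it
        elapsed = time.perf_counter() - start
        FAKE_DATABASE.append((func.__name__, elapsed))   # step 3: always written to storage
        return result                                    # step 4: output returned unchanged
    return wrapper


@hardwired_profile
def add(a, b):
    return a + b
```

Every call to `add` appends a row to `FAKE_DATABASE`, whether or not the caller wants it stored.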

The observer pattern allows for the following:

  1. the decorated function is called
  2. the code's execution is profiled
  3. observers are provided the profiling information
     a. write to database
     b. or just print results to console
     c. or do both
     d. _and maybe provide the test_id to a completely different function_
     e. et cetera
  4. the function's output is returned

This grants a programmer far more flexibility.
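
The observer-based flow can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the code on the observer_pattern branch: `profile_with`, `PrintObserver`, `ListObserver`, and the `update` method are hypothetical names, timing uses `time.perf_counter` rather than a real profiler, and observers are passed as instances here, whereas the earlier example passes classes.

```python
import time
from functools import wraps


class PrintObserver:
    """Hypothetical observer that only prints the measurement."""
    def update(self, method_name, elapsed):
        print(f"{method_name} took {elapsed:.6f}s")


class ListObserver:
    """Hypothetical observer that stores measurements, standing in for a database writer."""
    def __init__(self):
        self.rows = []

    def update(self, method_name, elapsed):
        self.rows.append((method_name, elapsed))


def profile_with(observers=()):
    """Decorator factory: time the call, then notify every attached observer."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            elapsed = time.perf_counter() - start
            for observer in observers:      # step 3: each observer decides what to do
                observer.update(func.__name__, elapsed)
            return result                   # step 4: output returned unchanged
        return wrapper
    return decorator


store = ListObserver()


@profile_with(observers=[PrintObserver(), store])
def square(x):
    return x * x
```

With no observers attached, the decorated function still runs and returns its value, and nothing is written anywhere.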

For example, I've attached a LoggableInterpreter to my project. This is essentially the SimpleInterpreter from my branch, but with logging.info instead of print. (The total_response_time is wrong; I'm still working on that.)

image
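
A logging-based observer along these lines might look like the sketch below. It is not the actual LoggableInterpreter: the class name `LoggingObserver`, its `update` signature, the logger name, and the `ListHandler` helper are assumptions for illustration. Only `logging.getLogger`, `Logger.info`, and `logging.Handler` are standard library calls.

```python
import logging

logger = logging.getLogger("profiling_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)


class LoggingObserver:
    """Hypothetical observer that reports through logging instead of print."""
    def update(self, method_name, total_response_time):
        logger.info(
            "method_name=%r total_response_time=%.6f",
            method_name,
            total_response_time,
        )


class ListHandler(logging.Handler):
    """Collects formatted messages so the output can be inspected."""
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)

# Example notification with made-up values.
LoggingObserver().update("function_3", 1.25)
```

Routing through `logging` leaves the choice of destination (console, file, or elsewhere) to whatever handlers the host application attaches.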


I am focused on trying to get something working with Dagster. I know you're probably indifferent about my desired use case, but I'll nonetheless keep documenting my progress:

~Dagster's execute_pipeline requires a Union[PipelineDefinition, IPipeline] as input. Source code~

~This means that if I were to wrap @pipeline, then the @PerformanceBreakpoint decorator would have to return a PipelineDefinition.~

@PerformanceBreakpoint(...)
@pipeline(...)
def my_cool_dagster_pipeline(...):
    ...

~But I am afraid that wrapping the @pipeline will just profile the pipeline construction, which is not what I want.~ ~This means that I need the @pipeline to wrap @PerformanceBreakpoint.~

~Here is the source code for the pipeline decorator. I'm trying to figure out how to proceed.~

Edit: after looking through Dagster's source code some more, I am pretty sure what I am trying to achieve is impossible. I think I will just have to wait until they add some kind of official implementation.

JoeyHendricks commented 2 years ago

Once I have time, I will improve the unit tests.