Closed afparsons closed 2 years ago
Hi Andrew,
I wouldn't trust my unit tests, I haven't updated them properly in a while and I have had it on my radar to fix them up a bit more properly. As I was the only person having a bit of trouble with those tests I did not prioritize fixing them so sorry for that :D.
Well done I am really liking the observer design pattern but I don't yet understand the idea behind it but I think it would be a great improvement to the framework! As the changes you have made are quite large in number we could schedule a call through Discord or something to go through them would you be up for that? (My time zone is CEST and I am available only after my office hours so 17:00 to 23:00 and all day on the weekends of course ;) ) Perhaps you could also elaborate a bit more on what you wish the change further :).
Edit: I have made a official branch to further experiment with adding the observer pattern into QuickPotato :)
I don't yet understand the idea behind it
Sure; I probably haven't explained my idea yet. I don't have time at this moment to go into detail, but here's an overview.
we could schedule a call
I'll get back to you about setting up a specific time. I have a busy week ahead of me :sweat:
Presently, QuickPotato runs in the following manner:
In all of the workflows you were imagining, step 3 seems perfectly reasonable. But what if a programmer does not want to write the results to the database? Or what if they want to perform multiple actions with the profiled data, including writing to the database (and they aren't too concerned about overhead :slightly_smiling_face: )? For example, they might want to let another function know about that test_id
.
The observer pattern allows for the following:
test_id
to a completely different function_
e. et ceteraThis grants a programmer far more flexibility.
For example, I've attached a LoggableInterpreter
to my project. This is essentially the SimpleInterpreter
from my branch, but with logging.info
instead of print
. (the total_response_time
is wrong... I'm still working on that):
I am focused on trying to get something working with Dagster. I know you're probably indifferent about my desired use case, but I'll nonetheless keep documenting my progress:
~Dagster's execute_pipeline
requires a Union[PipelineDefinition, IPipeline]
as input. Source code~
~This means that if I were to wrap @pipeline
, then the @PerformanceBreakpoint
decorator would have to return a PipelineDefinition
.~
@PerformanceBreakpoint(...)
@pipeline(...)
def my_cool_dagster_pipeline(...):
...
~But I am afraid that wrapping the @pipeline
will just profile the pipeline construction, which is not what I want.~
~This means that I need the @pipeline
to wrap @PerformanceBreakpoint
.~
~Here is the source code for the pipeline
decorator. I'm trying to figure out how to proceed.~
Edit: after looking through Dagster's source code some more, I am pretty sure what I am trying to achieve is impossible. I think I will just have to wait until they add some kind of official implementation.
once I have time will be improving the unit tests.
Hi Joey,
I have implemented my ideas on the
observer_pattern
branch of my fork. You can examine the diff here, although be warned that it is rather large.I'm most confident in the changes I made to
QuickPotato.profiling.intrusive
andQuickPotato.profiling.interpreters
. I'm least confident in any of the changes I made to the other test infrastructure.If I run
test_pt_boundary_testing
, all tests pass.If I run
test_pt_visualizations
, the test fails, although I'm not sure it was every supposed to succeed in the first place, asFlameGraph()
does not receive atest_id
.I'm most confused about the
test_pt_regression_testing
. If I run the tests again and again and again without making any changes to the code, I get different results. Observe the following screenshots, and note that they were all taken successively with no changes to code.I'm confused about why this happens. Is this expected behavior, or have I messed something up?
Edit: In case you are confused about how my
PerformanceBreakpoint
decorator should be used, see the following example:~I've not tested the
execution_wrapper
parameter yet, but I think I need to use that with Dagster. When I usedJoeyHendricks/QuickPotato:master
with Dagster:~None
for every column of theperformance_statistics
database.~~Based on that experience, I think I will need to respectively pass the following to
execution_wrapper
:~execute_pipeline
~execute_solid
~Edit 2: re Dagster Nope. I'm going to have to experiment a bit more with decorator voodoo.