Closed staabm closed 5 months ago
We could assert how often methods are called internally, but this would tighten the test to the impl
I was thinking of counting opcode executions via phpbench or similar, in future :thinking:
Or we could add blackfire performance tests with assertions on memory consumption/cpu time
assertions on memory consumption/cpu time
Unlikely runnable in CI due to missing stable workers :thinking:
Paging @dantleech for ideas :D
We benchmark every pull request to see if anything has changed in CPU/Memory consumption.
https://github.com/patchlevel/event-sourcing/pull/552#issuecomment-2018344313
At the moment we have not yet defined a threshold (diff in percent) to make the test fail. But we wanted to explore that too.
The trick is to use the same runner to run the base (old) benchmark and the new benchmark. Yes, there are still some differences because of the load on the runners, but the difference is still very small. It is important that there are not two different runners. Maybe this helps.
@DavidBadura I was really hoping there would be a deterministic instruction counter somewhere: CPUs are shared on runners, which make regression testing of benchmarks on shared infrastructure almost impossible :cold_sweat:
Paging @dantleech for ideas :D
I'm honestly not sure how much of an indicator nb. opcodes executed is to performance, as I guess some are more expensive than others, although it would be a solid metric for complexity and I guess a massive jump is likely going to be an issue, I might look into adding it to PHPBench at some point.
You can never be 100% confident about benchmarks as the universe is constantly changing state, but you can get a good level of confidence with statistics, probably even on a GH runner given enough samples and enough time. But PHPBench's analysis isn't optimal for that. I've had reasonable success using @DavidBadura 's method, failing if the result is > +10% f.e. and caught some regressions, although I'm not doing anything like that right now (even +100% is valuable)
I think what would be really useful is a service to aggregate the data and see it change over time, and such a service could also analyse the results vs. the enviornments and send alerts etc.
Before this PR
getParentClass
was unnecessary greedy resolving the whole parent hierarchy even though only a single level would be sufficientthe method stood out on a recent blackfire profile. notice the "explosion" of calls from ~1.500 to ~67.400 in