Results are inconsistent

maor-benami commented 3 years ago

Every time I run the test for the same implementation, I get varying results.

Does it depend on how much of my machine's resources are in use? I mean, if I run the test once with some program consuming CPU/memory in the background, and once with no other programs open, will I get different results?

Thanks 🙏

ryansolid commented 3 years ago

Yes, the benchmarks are intensive and run locally, so the results definitely depend on your machine's available resources. I often develop and then pause while running them. But to get what I consider stable runs I close everything down and leave the computer alone for a few minutes. Even things like thermal throttling can matter. I've definitely turned my MacBook Pro into a hotplate running these benchmarks. My newer M1 doesn't have that issue, mind you, and tends to be a bit more stable (other than that CPU slowdown is broken on it, so I have to disable it). I've also had setups that were really variable regardless of what I tried, like WSL on Windows. There I'd run tests individually over and over again until I could sort of average the best common results.

In general, though, that's why I consider local testing mostly just for reference and wait until submission before I consider anything official. When writing articles and comparisons not based on the official results, I often end up running tests several times, even on different days, until I get stable runs. I've found that when my computer is in the right state (no background processes) I get pretty consistent results. That's usually the best test: run Vanilla a couple of times and see how variable it is.
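
To make "see how variable it is" concrete, here is a minimal Python sketch that assumes you have copied the durations from a few repeated Vanilla runs by hand. The helper names and the 5% threshold are hypothetical choices for illustration, not part of the benchmark.

```python
import statistics


def relative_spread(durations_ms):
    """Sample standard deviation divided by the mean (coefficient of variation)."""
    mean = statistics.mean(durations_ms)
    return statistics.stdev(durations_ms) / mean


def looks_stable(durations_ms, threshold=0.05):
    """True when the run-to-run spread is at most `threshold` (assumed: 5%)."""
    return relative_spread(durations_ms) <= threshold


# Hypothetical durations in milliseconds, not real benchmark output.
quiet_machine = [50.0, 51.0, 49.0, 50.0]
busy_machine = [50.0, 70.0, 40.0, 60.0]

print(looks_stable(quiet_machine))
print(looks_stable(busy_machine))
```

If the Vanilla runs fail a check like this, the machine is probably not in a good state for comparing other implementations either.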

maor-benami commented 3 years ago

@ryansolid thank you Ryan!

Would you suggest running it on a dedicated machine, maybe as a cloud function? I would like to get results for 5-6 hand-picked libraries to see how my implementation compares to the others.

ryansolid commented 3 years ago

Maybe yeah. Admittedly I've had a lot of patience with this thing and I run stuff so often I just take it for granted. But it's worth a shot. You'd need to check out how dedicated the resources are.

maor-benami commented 3 years ago

Yes. One change in the code can change the results quite a lot, so I find myself running the test every few minutes.

Thanks again 🙏

krausest commented 3 years ago

As Ryan said, it's essential to have as few background tasks running as possible (I think this is even more important for modern CPUs, with all those thermal budgets). I'm not sure anything but a dedicated server can give you predictable, constant CPU performance for running these benchmarks. The result table contains the value and an interval (the 95% confidence interval). High values for that interval are an indication that results aren't stable. Can you show some examples of the varying results (e.g. screenshots from the result table)? BTW, I try to run vanillajs whenever I make a partial update to the table. If those results were off from the old results, I'd re-run the whole update.
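
For readers unfamiliar with the interval mentioned above, here is a rough Python sketch of how a mean and an approximate 95% confidence interval can be derived from repeated runs. It assumes a normal approximation (z = 1.96); the result table's actual computation may differ, and the function name and sample values are made up for illustration.

```python
import math
import statistics


def mean_with_ci95(samples):
    """Return (mean, half_width) of an approximate 95% confidence interval."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mean = statistics.mean(samples)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    # 1.96 is the normal-approximation multiplier for 95% coverage (assumption).
    return mean, 1.96 * sem


# Hypothetical durations in milliseconds, not real benchmark output.
stable_runs = [100.0, 101.0, 99.0, 100.0, 100.0]
noisy_runs = [100.0, 140.0, 80.0, 120.0, 60.0]

for label, runs in (("stable", stable_runs), ("noisy", noisy_runs)):
    mean, half_width = mean_with_ci95(runs)
    print(f"{label}: {mean:.1f} ± {half_width:.1f} ms")
```

Both sample sets have the same mean, but the noisy one gets a much wider interval, which is the signal to re-run on a quieter machine.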

krausest / js-framework-benchmark

Results are inconsistent #926