is tween running condition has very high overhead

Scirra / Construct-bugs

Public bug report submissions for Construct 3 and Construct Animate. Please read the guidelines then click the 'Issues' tab to get started.

https://www.construct.net

107 stars 83 forks source link

is tween running condition has very high overhead #7224

Closed F3der1co closed 1 year ago

F3der1co commented 1 year ago

Problem description

the is tween playing condition has a high performance cost

Attach a .c3p

isTweenRunningPerf.zip

Steps to reproduce

run the project in debug mode
observe CPU usage of the two groups

Observed result

the tween group has a 10 times higher performance cost than the timer group

Expected result

similar performance (maybe slightly higher tween cost as it supports multiple tags, but not 10x)

More details

I use the is tween running condition a lot in my projects and these condition end up as a good chunk of the performance overhead in my events. The timer check seems to be much cheaper, I assume that parity is not possible as the tween behavior supports multiple tags while the timer doesn't, but 10x the cost seems a bit excessive.

Affected browsers/platforms:

First affected release:

System details

View details

PASTE HERE

AshleyScirra commented 1 year ago

A project like this does not prove that the performance overhead matters at all. It only shows a relative difference between two features. Internally these could work completely differently (and in fact they do in this case), but it's hard to tell from the outside, and in practice it probably does not matter in the slightest. I'd also point out that any benchmark with low measurements is probably running with the hardware in low-power/low-frequency mode, so the results could well be more like the difference between 0.1% and 1% on an unrealistic benchmark that does not represent what any real project does. So on the whole I would advise against filing issues about performance like this. It is probably irrelevant to the real-world performance of projects and investigating and improving it would largely be a waste of time.

F3der1co commented 1 year ago

Like it's already clearly stated in the report, these are reports I do because of overhead that felt weirdly high and hard to optimize in my real project and are not arbitrary measurement I do randomly to waste my time. This is just my way to report it so I can isolate it and crate a minimal project, the comparison is there so I can argue that a similar thing has been solved with better performance. I don't see how I can report issues I encounter like these any other way. If there is a better place or way to report stuff of this kind I would like to know.

AshleyScirra commented 1 year ago

Benchmarks like this are rarely representative of what real projects do. If you want us to optimize a real project, it is much more useful to share your actual project, so we can make our own measurements, see how it's really used and what needs work, as well as make sure any optimizations are targeted to real-world usage rather than artifical benchmarks.

F3der1co commented 1 year ago

But the real project has 1300+ events with usages of "is tween playing" and "on timer" etc. for wildly different use cases (this shows up as 10+ event groups with an idle of 0.1-0.4% cpu usage each due to this (on my gaming pc). Which does add up to be significant, especially on lower power hardware like laptops. I'm still willing to share my project privately via e-mail if you think this is the best way to handle this, but I feel like this will lead to a lot of wasted time just looking at it to understand the project and why a "is tween playing" is used here or there.

As a middle ground I have created an example project that showcases the cost while making sure there is a heavy load to avoid the cpu from downclocking while also trying to show it in the context of an example use case.

isTweenC&C.zip

The frustrating thing about this that I can see that there are ways to solve something similar (i.e. timer etc like I showed) with much lower or almost now performance cost. So I don't understand why you are using a, from my pov, worse way when you already know and have developed a better way. I literally cannot see the upside with the current situation on both of these issues. On the release page you can also see that performance improvements have a relatively high up vote amount, so people care about it performance.

AshleyScirra commented 1 year ago

I will take a look at your project if you send it via email. What I don't want to do is get in to micro-optimizing benchmarks without good evidence it affects real-world games. People could submit hundreds of issues like this one about all sorts of different features, and create an absolute mountain of work for us with little benefit to actual projects. Artificial benchmarks don't prove that the performance meaningfully affects real-world games, and unless I see your project all I can do is take your word for it, on a highly complex technical topic over which even experienced people frequently misinterpret or misattribute results. However if I have the actual project to optimise I can do a proper evaluation of how the engine is performing with tools like Chrome's profiler which can provide a deep insight to the time spent in the different places of Construct's engine - and I may well find that the cause you stated is not actually the cause of the overhead, or that something else is also introducing a high overhead which could easily be optimised, etc. That kind of thing is very common!

F3der1co commented 1 year ago

@AshleyScirra this check basically does the exact same (except from some edge cases where you are tweening a value over 0). There is definitely something wrong with the current implementation of is tween playing. Screenshot 2023-07-18 135242 I also made sure that multi tags still work with this workaround check, and they do! here is the modified project if you want to confirm: CheekyCheck.zip If you still don't think this needs to be improved I give up and just use this condition instead.

AshleyScirra commented 1 year ago

This is still an artificial benchmark. I will take a look at real-world projects, but optimizing benchmarks like this is likely a waste of our limited time. Working on this is falling in to the trap described in Optimisation: don't waste your time, which I think is exactly the rabbit hole you are falling in to. Don't waste your time. If your real-world project is not fast enough, send it to me, and I will take a look at possible engine optimisations.

F3der1co commented 1 year ago

My project runs amazingly well because I test and optimize my projects, rewrite heavy stuff in js and isolate condition that are especially heavy and try to avoid them or make sure I can use a cheaper condition before it. For me heaving more performance headroom means I can add more dynamic elements to the layouts and make them more lively and responsive. So for example lowering cpu performance by 0.5% means I can add 50 additional dynamic swaying plants to be ticking in the viewport. So for me it's not about if the game runs well enough, but how much I can do in the game as a whole or in a specific layout while still hitting my performance target.

It's just that this condition shows up like a sore thumb in event groups leading to them idling at 0.1-0.4% CPU. Which seems very curious to me as a lot of similar check run much faster. I think these examples have proven that this check runs 10-4 times slower than very similar checks. I am now just not using this condition as much anymore as I have found a substitute that I showed here. So for me this issue is fixed because I have the knowledge that the check is heavy and I have a solid alternative.

I could be more understanding if it would be clear to me why it is done this way, maybe there is a valid reason and fixing this would be a trade off. Again this is not a marginal overhead, we are talking about 400-1000% unnecessary (maybe it has a necessity I am unaware off) overhead of this condition.

F3der1co commented 1 year ago

just for completion sake and after some additional test I have to admit that when having only one instance in the layout and/or tons of tweens (and timers) running at the same time the measurements leads to some situation where is tween playing outperform this check : pick by comparison sprite.tween.progress <> 0 (using progress is obviously better than .value as it works with none value tweens and works even when tweening over 0, though it performs much worse than .value I guess it has to do some more complex math and also check if property tweens are running. So there definitely is some complexity here and it's not as cut and dry.

Still in all tests no matter if many or little instances, little or many tween/timers running "is timer playing" performs much much better even performing 40 times better in one case (many instances with many timers/tween running)