Meta - Performance Optimization

scratchfoundation / scratch-vm

Virtual Machine used to represent, run, and maintain the state of programs for Scratch 3.0

http://scratchfoundation.github.io/scratch-vm/

BSD 3-Clause "New" or "Revised" License

1.21k stars 1.51k forks source link

Meta - Performance Optimization #391

Closed thisandagain closed 6 years ago

thisandagain commented 7 years ago

Scratch 3.0 should execute projects as fast or faster than Scratch 2.0. In some places performance is already superior, but we can use this issue to track places where we are seeing bottlenecks. Once a specific performance improvement has been identified an individual issue should be filed and referenced here (meta issue).

Issues

[ ] Procedures (getProcedureParamNames, getProcedureDefinition, etc.)
- [ ] https://llk.github.io/scratch-vm/#141490356
- [ ] http://llk.github.io/scratch-vm/#97272135
[x] Procedures ("Run without screen refresh")
- [x] Example: https://llk.github.io/scratch-vm/#141490356
- [x] PR: https://github.com/LLK/scratch-vm/pull/406
[ ] Variable getters / setters
- [ ] http://llk.github.io/scratch-vm/#24115792
[ ] Step thread
- [ ] http://llk.github.io/scratch-vm/#26886797
[ ] Touching
- [ ] (Needs minimal example case)
[ ] Turbo mode

Stress Tests

luxaritas commented 7 years ago

I know many resource-intensive projects use Phosphorus to compile to native JS. Is it possible to get near (or better than?) the performance of that?

TheBrokenRail commented 7 years ago

Here is another example project, runs better but still pretty slow. https://llk.github.io/scratch-vm/#67575180

thisandagain commented 7 years ago

@LFP6 It's hard to say for sure, but we should be able to get close.

thisandagain commented 7 years ago

Looks like one of the major issues is actually the "Run without Screen Refresh" option for procedures which is not fully implemented yet. I built a very simple example project that should be able to easily run at 30/60 fps but currently runs at 2-6 fps on my laptop: https://llk.github.io/scratch-vm/#141490356

Whenever possible we should try to build the simplest possible to demo to show a performance optimization area. While the giant / complex projects are great for stress testing, they aren't particularly helpful when trying to track some of the specific issues down.

griffpatch commented 7 years ago

I don't think it's specifically down to the custom blocks.

Here's a simple project that loops for a second, incrementing a variable, then it outputs the number of times it looped:

http://llk.github.io/scratch-vm/#24115792 https://scratch.mit.edu/projects/24115792/ https://phosphorus.github.io/#24115792

In Scratch 3 it reports only 45,000 per second. In Scratch 2 it reports 500,000 iterations per second. In phosphorus it reports 2,900,000 million iterations per second.

There are no custom blocks in the looping code, I think the basic variable handling is just very much slower at present in Scratch 3.0 than in 2.0 at present.

Do you do any variable name caching in 3 yet? Scratch 2 keeps hold of variables once they've been looked up once.

Once you bring basic variable handling up to speed, all complex projects will show a real boost in performance, roughly 10 times perhaps going on those figures?

Griffpatch

On 24 January 2017 at 01:54, Andrew Sliwinski notifications@github.com wrote:

Looks like one of the major issues is actually the "Run without Screen Refresh" option for procedures which is not fully implemented yet. Very simple example project that should be able to easily run at 30/60 fps but currently runs at 2-6 fps on my laptop: https://llk.github.io/scratch-vm/#141490356

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-274677956, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvgR6shaUweE6xY88KfpXaHjtUsGPks5rVVnQgaJpZM4Lqdzb .

griffpatch commented 7 years ago

Ok, so I realised that I should probably not jump to conclusions, so I tried to mitigate the change var from the program to test more generic runtime performance. This one now spends most of the time just going round an empty loop.

https://scratch.mit.edu/projects/26886797/ http://llk.github.io/scratch-vm/#26886797 https://phosphorus.github.io/#26886797

In Scratch 3 it reports only 1,200 iterations per second. In Scratch 2 it reports 18,000 iterations per second. In phosphorus it reports 48,000 iterations per second.

With that in mind, I'd say it forget my over enthusiastic jump to conclusions about it being variable related (unless of course the repeat loop uses the same mechanism internally to keep count?). And the next guess is that the scratch 3 player just takes a lot longer to get from one execution step to the next? I was wondering how much of that time is invested in highlighting scratch blocks to show execution, etc...

On 24 January 2017 at 09:01, Andrew Griffin andy@griffpatch.co.uk wrote:

I don't think it's specifically down to the custom blocks.

Here's a simple project that loops for a second, incrementing a variable, then it outputs the number of times it looped:

http://llk.github.io/scratch-vm/#24115792 https://scratch.mit.edu/projects/24115792/ https://phosphorus.github.io/#24115792

In Scratch 3 it reports only 45,000 per second. In Scratch 2 it reports 500,000 iterations per second. In phosphorus it reports 2,900,000 million iterations per second.

There are no custom blocks in the looping code, I think the basic variable handling is just very much slower at present in Scratch 3.0 than in 2.0 at present.

Do you do any variable name caching in 3 yet? Scratch 2 keeps hold of variables once they've been looked up once.

Once you bring basic variable handling up to speed, all complex projects will show a real boost in performance, roughly 10 times perhaps going on those figures?

Griffpatch

On 24 January 2017 at 01:54, Andrew Sliwinski notifications@github.com wrote:

Looks like one of the major issues is actually the "Run without Screen Refresh" option for procedures which is not fully implemented yet. Very simple example project that should be able to easily run at 30/60 fps but currently runs at 2-6 fps on my laptop: https://llk.github.io/scratch-vm/#141490356

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-274677956, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvgR6shaUweE6xY88KfpXaHjtUsGPks5rVVnQgaJpZM4Lqdzb .

thisandagain commented 7 years ago

Yup! Variable lookups do look like they could use some help, but so do procedure lookups, and (as you note) just thread stepping in general.

griffpatch commented 7 years ago

Interesting that when profiling the JavaScript execution, there are a lot of "Not optimized" messages, like so:

[image: Inline images 1]

Warning: Not optimized: Function is being debugged Function: Thread.pushStack @ thread.js:97

I don't know what the "Function is being debugged" javascript optimisation bailout means. From what I read up it implies there is debugger code in the javascript, but I don't see any references to debugger script? It's a funny place to be taking so much time overall just pushing to the stack - I guess it does do it an awful lot ;)

On 24 January 2017 at 13:28, Andrew Sliwinski notifications@github.com wrote:

Yup! Variable lookups do look like they could use some help, but so do procedure lookups, and (as you note) just thread stepping in general.

On Tue, Jan 24, 2017 at 6:10 AM griffpatch notifications@github.com wrote:

Ok, so I realised that I should probably not jump to conclusions, so I tried to mitigate the change var from the program to test more generic runtime performance. This one now spends most of the time just going round an empty loop.

https://scratch.mit.edu/projects/26886797/ http://llk.github.io/scratch-vm/#26886797 https://phosphorus.github.io/#26886797

In Scratch 3 it reports only 1,200 iterations per second. In Scratch 2 it reports 18,000 iterations per second. In phosphorus it reports 48,000 iterations per second.

With that in mind, I'd say it forget my over enthusiastic jump to conclusions about it being variable related (unless of course the repeat loop uses the same mechanism internally to keep count?). And the next guess is that the scratch 3 player just takes a lot longer to get from one execution step to the next? I was wondering how much of that time is invested in highlighting scratch blocks to show execution, etc...

On 24 January 2017 at 09:01, Andrew Griffin andy@griffpatch.co.uk wrote:

I don't think it's specifically down to the custom blocks.

Here's a simple project that loops for a second, incrementing a variable, then it outputs the number of times it looped:

http://llk.github.io/scratch-vm/#24115792 https://scratch.mit.edu/projects/24115792/ https://phosphorus.github.io/#24115792

In Scratch 3 it reports only 45,000 per second. In Scratch 2 it reports 500,000 iterations per second. In phosphorus it reports 2,900,000 million iterations per second.

There are no custom blocks in the looping code, I think the basic variable handling is just very much slower at present in Scratch 3.0 than in 2.0 at present.

Do you do any variable name caching in 3 yet? Scratch 2 keeps hold of variables once they've been looked up once.

Once you bring basic variable handling up to speed, all complex projects will show a real boost in performance, roughly 10 times perhaps going on those figures?

Griffpatch

On 24 January 2017 at 01:54, Andrew Sliwinski < notifications@github.com> wrote:

Looks like one of the major issues is actually the "Run without Screen Refresh" option for procedures which is not fully implemented yet. Very simple example project that should be able to easily run at 30/60 fps but currently runs at 2-6 fps on my laptop: https://llk.github.io/scratch-vm/#141490356

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <https://github.com/LLK/scratch-vm/issues/391#issuecomment-274677956 , or mute the thread < https://github.com/notifications/unsubscribe-auth/ AGbNvgR6shaUweE6xY88KfpXaHjtUsGPks5rVVnQgaJpZM4Lqdzb

.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-274774832, or mute the thread https://github.com/notifications/unsubscribe-auth/AAtoeSb4GD_ PLYaLaKWdyoKfGF69kc9oks5rVdv5gaJpZM4Lqdzb .

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-274802205, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvrAtemRA5Kz5DZMAw1NYHdAcJBHZks5rVfxjgaJpZM4Lqdzb .

thisandagain commented 7 years ago

From discussion with @cwillisf: We've gone back and forth on using requestAnimationFrame for thread stepping (see discussion here: https://github.com/LLK/scratch-vm/issues/208). After looking at the timeline & profiler it looks like we might want to investigate further now that we have more of the VM built out.

griffpatch commented 7 years ago

I've been playing around trying to find out why things run much slower in Scratch 3.

In doing so I've been finding plenty of small JavaScript optimisations that can speed things up a degree.

I've managed to boost general script execution time to be around 200% to 350% in my local build at present, but I've not yet stumbled across anything to boost it to the level of Scratch 2 (1,000% or more).

I'm not familiar with using github to contribute, what I was thinking is could I commit by changes to my own branch somehow so anyone who is interested could dip into it and see what I changed in case you want to do similar, or get some ideas as to what things may help?

I'm not sure you would be even thinking too much about performance right now?

All the best,

Griffpatch

On 24 January 2017 at 15:11, Andrew Sliwinski notifications@github.com wrote:

From discussion with @cwillisf https://github.com/cwillisf: We've gone back and forth on using requestAnimationFrame from thread stepping (see discussion here: #208 https://github.com/LLK/scratch-vm/issues/208). After looking at the timeline & profiler it looks like we might want to investigate further now that we have more of the VM built out.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-274831111, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvix739s-TulYdqywzPc1V6I4LTAuks5rVhSMgaJpZM4Lqdzb .

thisandagain commented 7 years ago

@griffpatch That sounds fine as a first step. Thanks.

Performance isn't a top priority right now (we still have much to implement) but it's important to check-in every once and a while to make sure we aren't making architectural decisions that would prevent 3.0 from having better performance than the existing system. Thus far, I don't see any blockers and thus we probably won't prioritize a ton of work here until closer to launch.

griffpatch commented 7 years ago

I didn't think you would be worrying to much about optimising things yet. Some of the fixes are simple array index out of bounds checks that are currently happening a lot and slow things down in critical code.

Other changes are to remove un-required stack manipulations.

I've removed some constant browser compatibility tests in favour of a one off check.

I've removed a few dynamically created objects in favor of using prototype classes (direct swap).

Do i just need to click create branch or is that too closely attached to your official source?

Griffpatch

On 25 Jan 2017 4:27 p.m., "Andrew Sliwinski" notifications@github.com wrote:

@griffpatch https://github.com/griffpatch That sounds fine as a first step. Thanks.

Performance isn't a top priority right now (we still have much to implement) but it's important to check-in on every once and a while to make sure we aren't making architectural decisions that would prevent 3.0 from having better performance than the existing system. Thus far, I don't see any blockers and thus we probably won't prioritize a ton of work here until closer to launch.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-275156158, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvkOyN8YAxA323AjdXk7LEk-OxXJ7ks5rV3fzgaJpZM4Lqdzb .

griffpatch commented 7 years ago

Oh, and I added an initial test of some procedure name parameter caching.

On 25 Jan 2017 4:50 p.m., "Andrew Griffin" andy@griffpatch.co.uk wrote:

I didn't think you would be worrying to much about optimising things yet. Some of the fixes are simple array index out of bounds checks that are currently happening a lot and slow things down in critical code.

Other changes are to remove un-required stack manipulations.

I've removed some constant browser compatibility tests in favour of a one off check.

I've removed a few dynamically created objects in favor of using prototype classes (direct swap).

Do i just need to click create branch or is that too closely attached to your official source?

Griffpatch

On 25 Jan 2017 4:27 p.m., "Andrew Sliwinski" notifications@github.com wrote:

@griffpatch https://github.com/griffpatch That sounds fine as a first step. Thanks.

Performance isn't a top priority right now (we still have much to implement) but it's important to check-in on every once and a while to make sure we aren't making architectural decisions that would prevent 3.0 from having better performance than the existing system. Thus far, I don't see any blockers and thus we probably won't prioritize a ton of work here until closer to launch.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-275156158, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvkOyN8YAxA323AjdXk7LEk-OxXJ7ks5rV3fzgaJpZM4Lqdzb .

thisandagain commented 7 years ago

@griffpatch If you could push that branch to your fork (https://github.com/griffpatch/scratch-vm) it would be most appreciated.

griffpatch commented 7 years ago

Ok phew... done that. All my changes are committed to the branch: https://github.com/griffpatch/scratch-vm/

Thanks,

Griffpatch

On 25 January 2017 at 17:00, Andrew Sliwinski notifications@github.com wrote:

@griffpatch https://github.com/griffpatch If you could push that branch to your fork (https://github.com/griffpatch/scratch-vm) it would be most appreciated.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-275166212, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvomRV15m-V3HrSCTsbgtryM9k_bfks5rV3_IgaJpZM4Lqdzb .

griffpatch commented 7 years ago

I've built a scratch 3 gh-pages for you to compare the difference in performance on my local fork.

This one is a quick one:

http://llk.github.io/scratch-vm/#141514080 https://griffpatch.github.io/scratch-vm/#141514080

This one is more impressive to watch ;) - But still nowhere near fast enough to pull it off like in Scratch 2.

http://llk.github.io/scratch-vm/#61991330 https://griffpatch.github.io/scratch-vm/#61991330

Is there anything further you'd like me to do with fork or have I done this correctly?

Thanks,

Griffpatch

On 25 January 2017 at 22:37, Andrew Griffin andy@griffpatch.co.uk wrote:

Ok phew... done that. All my changes are committed to the branch: https://github.com/griffpatch/scratch-vm/

Thanks,

Griffpatch

On 25 January 2017 at 17:00, Andrew Sliwinski notifications@github.com wrote:

@griffpatch https://github.com/griffpatch If you could push that branch to your fork (https://github.com/griffpatch/scratch-vm) it would be most appreciated.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-275166212, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvomRV15m-V3HrSCTsbgtryM9k_bfks5rV3_IgaJpZM4Lqdzb .

thisandagain commented 7 years ago

@griffpatch Thanks for all the investigation! This looks good. I'll start to pull in some of your changes soon.

griffpatch commented 7 years ago

Hey there,

You will probably have noted that I went a bit mad today with trying to work out how to use github! I realised too late that my original fork should then have been branched to allow me to create individual pull requests for each issue fixed. I've sorted out separating some issues into their own branches and submitted pull requests for them (just ones that were marked for wanting help).

So did I do this right? Or would you prefer me not to be submitting pull requests and keep the issues in my fork, but with links to them via the comments on the commits?

Also, would you like me to split up my optimisation changes into separate branches too so they can be taken individually?

Griffpatch

On 26 January 2017 at 14:03, Andrew Sliwinski notifications@github.com wrote:

@griffpatch https://github.com/griffpatch Thanks for all the investigation! This looks good. I'll start to pull in some of your changes soon.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-275395157, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvtMsKYfQo-Kz283gGvmk0KMTXqdkks5rWKfHgaJpZM4Lqdzb .

thisandagain commented 7 years ago

@griffpatch This is great! Individuals branches with separate PRs is always preferred. If it's not too much to ask, splitting up your optimization changes into separate PRs would be most helpful. 😄

griffpatch commented 7 years ago

I'm a bit worried that the current Scratch 3 architecture is not going to be able to be optimised to allow it to run scripts at the same pace as Scratch 2.

I've been looking through the code, refactoring where possible, conducting timing tests and googling for people opinions on the fastest approaches using closures, objects and native functions, but after extensive work I have not been able to get it to run much faster than just over twice the current build speed... and that's with quite some refactoring and added caching of bock inputs and custom blocks.

I'm sure twice as fast is good, but it's still only under 20% to 25% of the speed of scratch 2?

For your reference, here are the optimisations I've implemented, or tested:

Refactored the custom object (refereed to as 'util' that is passed into the blockFunction() during execute to be a prototype class (such that all it's methods are already defined)
Merged the handleReport closure into the prototype class described above to share it's internal member variables.
I removed the stack & stackframe arrays in favor of a linked list approach combining both together... I was rather underwhelmed by the effect sadly (The simple array was requiring lots of extra script to get out the top and parent items, whereas using an actual linked list meant that I just stored a pointer to the top item, so instead of saying "this.Stack[this.Stack.length-1]", I just use "this.Stack"... well... it may be a bit better but only incrementally.
I've tried pulling the 'constant' inputs from blocks into the parent execute method to reduce the number off push and pops of the stack, but that had no noticeable effect sadly... Perhaps if my scratch project has more inputs...
I cache the block arg structures (extra work is done for every block with inputs to get a list of the arg names minus the text parts, so I cache these)
I cache the custom block parameters & id - this does help quite a bit in reducing execution time - but not enough to make a huge inroad.

Simple things like just updating a variable take considerably more time, and all I've come to the conclusion that it's because javascript is just plain slow at deriving object lookups and then executing function calls? The scripts of Scratch 3 are beautiful in there calling structure, and it would be awesome if it was complied, or if javascript would just do a better job optimising, but it doesn't appear to be quite hitting the mark right now :/.

What is the current consensus of the Scratch team on how things are looking? The sprite costume handling and collisions are a work of art and super fast, and the block rendering looks fab, but I'm worried about how we might achieve the speed increases that everyone would hope for. Do you guys have any other tricks up your sleeves that I haven't considered?

Perhaps a deeper refactoring to remove the need for a closure / prototype class at all for the handleReport / util object? Not sure if that would help or is even easily possible due to the scoping requirements of the util object. Otherwise more caching of runtime data against the blocks? But this muddies the water.

Well, I've really poured myself into this, and I just wanted to ask these questions, I hope you don't mind.

All the best,

Griffpatch

On 27 January 2017 at 16:03, Andrew Sliwinski notifications@github.com wrote:

@griffpatch https://github.com/griffpatch This is great! Individuals branches with separate PRs is always preferred. If it's not too much to ask, splitting up your optimization changes into separate PRs would be most helpful. 😄

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-275701255, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvqQ8pHWppILC-nSwd9pz3ZRMDtqdks5rWhVAgaJpZM4Lqdzb .

cwillisf commented 7 years ago

Thanks for all your work and your thoughts, @griffpatch! I agree with your concern about the current performance of Scratch 3.0 but I believe we'll be able to improve it quite a bit in the future.

In short, the team is aware that Scratch 3.0 isn't as fast as we'd like but we're focusing most of our time and effort on features right now. We may occasionally do some performance work when we notice a specific problem -- @rschamp did so in LLK/scratch-gui#61 today, for example -- but the priority right now is to get as much of Scratch 3.0 basically working as we can, as soon as we can. One way to think about that is this: if a large number of people are suddenly unable to run Scratch 2.0 for one reason or another, I'd rather offer them a version of Scratch 3.0 which can run every Scratch 2.0 project, but complex ones run slowly, than a version of Scratch 3.0 which is reliably fast but can't run all projects.

Another reason we're not yet focusing on performance is that as we implement new features we're likely to make occasional large changes to the internals of the VM and other modules, which might change the performance characteristics. For example, preparing the renderer for an external storage module led to a speedup in changing costumes (the old code had always been intended as temporary), and the recent work on the pen blocks led to noticing a way we could potentially reduce the impact of "sprite info" events. It's possible that we'll end up removing "sprite info" events entirely, though, so we haven't yet spent the time to implement that optimization.

All that said, performance is definitely important. On one hand, better performance means that you can run a more complex project (cough @griffpatch cough) on a given piece of hardware. On the other hand, better performance also means that a given project can run well on cheaper, older, or lower-powered hardware -- potentially opening Scratch to a larger audience.

Some time later this year I expect we'll be moving from large, sometimes disruptive changes to more detail-oriented work -- the 10% of the work that requires 90% of the effort -- and we'll probably put more effort toward performance at that time. We'll break out the profilers and the whiteboards and figure out if we have to cache this data or unpack that object, or rotate all of the VM's internals 90 degrees to the left... whatever we need to do in order to get great performance out of the new codebase. As we work toward that phase, we may also do smaller performance passes now and then to keep performance creep under control.

In the mean time, we really do appreciate help from you and others in the community. One of the best things about an active open-source community is that people like yourself and others can contribute to parts of the effort that we're not currently focusing on (like performance), or where we missed a detail (like pen alpha). It takes a village to raise Scratch 3.0!

griffpatch commented 7 years ago

Well thank you greatly for taking the time to reply at such length :).

That does confirm what I guess I knew, that you were focusing on getting things working fully first ;). Also of course by not optimising things now you leave the choice fully open to be more easily rearranged and optimised later on... With that in mind would it be best for any optimisations I make to be small in scale for the time being? I could always check in a branch with optimisation tests (larger scale) separately and without pull requests.

Thanks :)

On 2 Feb 2017 11:36 p.m., "Chris Willis-Ford" notifications@github.com wrote:

Thanks for all your work and your thoughts, @griffpatch https://github.com/griffpatch! I agree with your concern about the current performance of Scratch 3.0 but I believe we'll be able to improve it quite a bit in the future.

In short, the team is aware that Scratch 3.0 isn't as fast as we'd like but we're focusing most of our time and effort on features right now. We may occasionally do some performance work when we notice a specific problem -- @rschamp https://github.com/rschamp did so in LLK/scratch-gui#61 https://github.com/LLK/scratch-gui/pull/61 today, for example -- but the priority right now is to get as much of Scratch 3.0 basically working as we can, as soon as we can. One way to think about that is this: if a large number of people are suddenly unable to run Scratch 2.0 for one reason or another, I'd rather offer them a version of Scratch 3.0 which can run every Scratch 2.0 project, but complex ones run slowly, than a version of Scratch 3.0 which is reliably fast but can't run all projects.

Another reason we're not yet focusing on performance is that as we implement new features we're likely to make occasional large changes to the internals of the VM and other modules, which might change the performance characteristics. For example, preparing the renderer for an external storage module led to a speedup in changing costumes (the old code had always been intended as temporary), and the recent work on the pen blocks led to noticing a way we could potentially reduce the impact of "sprite info" events. It's possible that we'll end up removing "sprite info" events entirely, though, so we haven't yet spent the time to implement that optimization.

All that said, performance is definitely important. On one hand, better performance means that you can run a more complex project (cough @griffpatch https://github.com/griffpatch cough) on a given piece of hardware. On the other hand, better performance also means that a given project can run well on cheaper, older, or lower-powered hardware -- potentially opening Scratch to a larger audience.

Some time later this year I expect we'll be moving from large, sometimes disruptive changes to more detail-oriented work -- the 10% of the work that requires 90% of the effort -- and we'll probably put more effort toward performance at that time. We'll break out the profilers and the whiteboards and figure out if we have to cache this data or unpack that object, or rotate all of the VM's internals 90 degrees to the left... whatever we need to do in order to get great performance out of the new codebase. As we work toward that phase, we may also do smaller performance passes now and then to keep performance creep under control.

In the mean time, we really do appreciate help from you and others in the community. One of the best things about an active open-source community is that people like yourself and others can contribute to parts of the effort that we're not currently focusing on (like performance), or where we missed a detail (like pen alpha). It takes a village to raise Scratch 3.0!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-277119582, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvpTNludQkjF5T9G1SyppTUZvidNgks5rYmhUgaJpZM4Lqdzb .

griffpatch commented 7 years ago

If you want to see something working pretty well, then try out this :)

https://griffpatch.github.io/scratch-vm/#105500895

To get this working I had to:

Fix "stop this script" form terminating threads
Fix case insensitive broadcasts
Fix the " " == 0 bug.
Implement sprite fencing (to stop them going offscreen)

The optimisations I've made also make the runtime engine roughly 3 times quicker now.

These changes are in my optimisation/reuseStackForNextBlock https://github.com/griffpatch/scratch-vm/tree/optimisation/reuseStackForNextBlock branch at present.

On 2 February 2017 at 23:47, Andrew Griffin andy@griffpatch.co.uk wrote:

Well thank you greatly for taking the time to reply at such length :).

That does confirm what I guess I knew, that you were focusing on getting things working fully first ;). Also of course by not optimising things now you leave the choice fully open to be more easily rearranged and optimised later on... With that in mind would it be best for any optimisations I make to be small in scale for the time being? I could always check in a branch with optimisation tests (larger scale) separately and without pull requests.

Thanks :)

On 2 Feb 2017 11:36 p.m., "Chris Willis-Ford" notifications@github.com wrote:

Thanks for all your work and your thoughts, @griffpatch https://github.com/griffpatch! I agree with your concern about the current performance of Scratch 3.0 but I believe we'll be able to improve it quite a bit in the future.

In short, the team is aware that Scratch 3.0 isn't as fast as we'd like but we're focusing most of our time and effort on features right now. We may occasionally do some performance work when we notice a specific problem -- @rschamp https://github.com/rschamp did so in LLK/scratch-gui#61 https://github.com/LLK/scratch-gui/pull/61 today, for example -- but the priority right now is to get as much of Scratch 3.0 basically working as we can, as soon as we can. One way to think about that is this: if a large number of people are suddenly unable to run Scratch 2.0 for one reason or another, I'd rather offer them a version of Scratch 3.0 which can run every Scratch 2.0 project, but complex ones run slowly, than a version of Scratch 3.0 which is reliably fast but can't run all projects.

Another reason we're not yet focusing on performance is that as we implement new features we're likely to make occasional large changes to the internals of the VM and other modules, which might change the performance characteristics. For example, preparing the renderer for an external storage module led to a speedup in changing costumes (the old code had always been intended as temporary), and the recent work on the pen blocks led to noticing a way we could potentially reduce the impact of "sprite info" events. It's possible that we'll end up removing "sprite info" events entirely, though, so we haven't yet spent the time to implement that optimization.

All that said, performance is definitely important. On one hand, better performance means that you can run a more complex project (cough @griffpatch https://github.com/griffpatch cough) on a given piece of hardware. On the other hand, better performance also means that a given project can run well on cheaper, older, or lower-powered hardware -- potentially opening Scratch to a larger audience.

Some time later this year I expect we'll be moving from large, sometimes disruptive changes to more detail-oriented work -- the 10% of the work that requires 90% of the effort -- and we'll probably put more effort toward performance at that time. We'll break out the profilers and the whiteboards and figure out if we have to cache this data or unpack that object, or rotate all of the VM's internals 90 degrees to the left... whatever we need to do in order to get great performance out of the new codebase. As we work toward that phase, we may also do smaller performance passes now and then to keep performance creep under control.

In the mean time, we really do appreciate help from you and others in the community. One of the best things about an active open-source community is that people like yourself and others can contribute to parts of the effort that we're not currently focusing on (like performance), or where we missed a detail (like pen alpha). It takes a village to raise Scratch 3.0!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-277119582, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvpTNludQkjF5T9G1SyppTUZvidNgks5rYmhUgaJpZM4Lqdzb .

rschamp commented 7 years ago

@griffpatch That's impressive! The game is actually fun to play!! We'd welcome each of those bug fixes as (separate) PRs.

The logic for sprite fencing should be performed in scratch-render and used by the VM. Currently we have nothing, so if you like, please make a PR for a method containing your sprite fencing logic over on scratch-render.

It looks like we're going to have to adjust our linting rules for at least one of your optimizations. Currently we disallow using bare undefined, but the way you are using it to compare to declared variables is correct and faster.

griffpatch commented 7 years ago

It's a little hard to break up some of these optimisations as they are in some ways dependent on each other...? Or can I branch from each separate PR branch in a series of hops?

I am aware that it is not ready for pulling as the testing is not there... and i was wondering about that undefined test.

Also I did place a todo on the fencing code as I knew it was in the wrong place... Is there any way to setup my local scratch vm to take it's dependancies from my other forked projects rather than the scratch ones? Otherwise my local build will not work once I update multiple projects that are interrelated?

On 7 February 2017 at 15:42, Ray Schamp notifications@github.com wrote:

@griffpatch https://github.com/griffpatch That's impressive! The game is actually fun to play!! We'd welcome each of those bug fixes as (separate) PRs.

The logic for sprite fencing should be performed in scratch-render and used by the VM. Currently we have nothing, so if you like, please make a PR for a method containing your sprite fencing logic over on scratch-render.

It looks like we're going to have to adjust our linting rules for at least one of your optimizations. Currently we disallow using bare undefined, but the way you are using it to compare to declared variables is correct and faster.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-278038529, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvqnZHarNupINulHNtOUsDOiajCA_ks5raJDdgaJpZM4Lqdzb .

rschamp commented 7 years ago

If there is not a way to split them, interdependent PRs are OK. It's just easier to figure out the history of things later down the line if each PR has a distinct purpose.

To use your fork, you need to use npm link:

> cd /path/to/scratch-render
> npm link
> cd /path/to/scratch-vm
> npm link scratch-render

This makes a symlink to your local copy of scratch-render in scratch-vm, so when scratch-vm tries to install or use scratch-render it will use what you have locally rather than installing from npm.

Note that this requires scratch-render to be built locally. You can either run npm run build after each change, or you can use npm run watch to have scratch-render continually build as you make changes — these changes should in turn be picked up by webpack in scratch-vm if you're running npm start or npm run watch there.

griffpatch commented 7 years ago

I've been taking a look over all the optimisations that I've made in my fastest build to see what was easy to bring in, but the answer is probably not a lot more.

The optimisations I have are as follows:

Refactor a number of the methods in "blocks.js" to accept a block object rather than a blockId (many places that call these already have access to the block object and removing the extra indirection increases the overall runtime performance by around 21%. This is the smallest of my potential refactors.
Reimplement threads.js to hold the stack & stackframe as linked lists instead of arrays. I'm not sure quite how much of a saving this is as I haven't got a build with this optimisation on it's own. It lends itself nicely to making further optimisations though as it naturally removes a lot of calls to the array as the object you are working with 'is' the head of the stack so no indirection is needed.
Caching - This makes a big difference to speed (caching block inputs as well as procedure parameters and blocks).
Removing the closures from the execution handling functions - I pulled these out into their own class to prevent constant creation of closures and recreation of the callbacks within them. I think this is a smart move and is easier to read too. Again, no timings for this yet.

I know all these optimisations help to speed up the whole runtime quite significantly (200% to 300%), but I don't want to put in more effort separating them out and creating PRs for them if they would be wasted at present. I can always do this later if you decide you'd like that. However, if you do want any of these particular avenues explored and a PR created then please let me know.

Thanks

Griffpatch

On 7 February 2017 at 16:01, Ray Schamp notifications@github.com wrote:

If there is not a way to split them, interdependent PRs are OK. It's just easier to figure out the history of things later down the line if each PR has a distinct purpose.

To use your fork, you need to use npm link:

cd /path/to/scratch-render npm link cd /path/to/scratch-vm npm link scratch-render

This makes a symlink to your local copy of scratch-render in scratch-vm, so when scratch-vm tries to install or use scratch-render it will use what you have locally rather than installing from npm.

Note that this requires scratch-render to be built locally. You can either run npm run build after each change, or you can use npm run watch to have scratch-render continually build as you make changes — these changes should in turn be picked up by webpack in scratch-vm if you're running npm start or npm run watch there.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-278044817, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNvosOhDmZFUf2HpEY6zOaqw4vTYnfks5raJVzgaJpZM4Lqdzb .

thisandagain commented 7 years ago

👍 Thanks for the overview @griffpatch . I think I'd be happy to review and land each of these. For caching I'm wondering if we should bring in a module to help reduce the boilerplate like Memoizee or LRU Cache.

griffpatch commented 7 years ago

I don't think the caching is very difficult, the only thing is making sure that 'any' invalidation of the scratch blocks invalidates the caching nicely :)

On 10 February 2017 at 13:48, Andrew Sliwinski notifications@github.com wrote:

👍 Thanks for the overview @griffpatch https://github.com/griffpatch . I think I'd be happy to review and land each of these. For caching I'm wondering if we should bring in a module to help reduce the boilerplate like Memoizee https://github.com/medikoo/memoizee or LRU Cache https://github.com/isaacs/node-lru-cache.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/LLK/scratch-vm/issues/391#issuecomment-278945839, or mute the thread https://github.com/notifications/unsubscribe-auth/AGbNviMfUMk7iLgGaLQuBc0_3B5BR0ybks5rbGrFgaJpZM4Lqdzb .

thisandagain commented 7 years ago

Not worried about the difficulty, worried about additional code that we need to maintain and test.

thisandagain commented 6 years ago

Closing as we are now tracking these issues in other areas / individual tickets. 🎉