(Feature Request) Better sprite communication: "Broadcast and wait" should run instantaneously inside no-refresh custom blocks

(This is a feature request.)

Since the introduction of custom blocks in Scratch 2.0, "run without screen refresh" custom blocks have had a strange "edge case" behavior revealing awkward internals of the Scratch programming language. It's easy to demonstrate, as in the below script:

demo-run-without-screen-refresh

Try replicating this and running the project. You'll see the project runs extremely sluggishly; Scratch is normally supposed to run at 30 frames per second, but here it runs at only 1 frame per 500ms! After 10 frames (= 5s), the loop completes, and the sprite says "done".

I believe it's fairly safe to say that replacing this odd behavior with a more useful feature would not harm any backwards compatibility, because theoretically zero projects depend on this arbitrary behavior.

So, the proposal: Receiver scripts across all targets (stage/sprites/clones) should be found and then run to completion instantaneously, returning execution to the custom block within the same frame.

Arguments for this:

It's compatible. Like mentioned, no project should depend on the arbitrary existing behavior!
It's intuitive, at least in contrast to the existing behavior: why SHOULD this "broadcast and wait" cause such a long delay? There is a logical answer, but it requires an understanding of the Scratch internal codebase, and that isn't useful education for learners!
It's educational. Learners will often find out (or be taught) about the broadcast blocks, which enable communication between sprites, clones, and the stage. For programmers with experience outside of Scratch, it's easy to draw connections to object oriented programming: communication across objects is the foundations of OOP! However, Scratch is limited in practical applications of this style of OOP, because currently, all cross-sprite communication requires waiting one frame. This prevents learners from really making use of their new knowledge, at least until they move to languages beyond Scratch.

Arguments against:

It may take some time to implement. Wouldn't be a high priority issue if dev team members are busy.
It's niche. However, maybe not so much as I'll argue below. (And a niche added feature has much less risk of community confusion or added requests than a big one.)

Finally, I'll say I believe this suggestion to be in the spirit of the existing Scratch language design, because Scratch provides low-level tools, such as clones, variables, and lists, which users may develop more powerful abstractions on top of, such as first class lists/sprites, pointers, and dictionaries. Largely, Scratch does not put arbitrary obstacles in front of this: the basic concepts it provides are very powerful, after all!

However, the current behavior is not only confusing and altogether impractical, it is also limiting. It could be replaced with a much more powerful (albeit very simple!) behavior, and this would enable Scratchers to create all kinds of interactions between sprites, rather than being forced to either wait one frame (not good for most complex programs, let alone games that require fast sprite interaction) or simply not communicate between sprites at all (relegating all data to lists, which is its own powerful abstraction, but it is no good to lock a learning coder into one programming style instead of giving them the options to explore).

Like I said, this is a niche request, so I wouldn't expect it to be prioritized! However, if it were discussed within the team and were given the green light, I'd be happy to have a go at implementing it. (I don't think it would be too complicated to put together, honestly.) It could be a really powerful addition to the language, without doing anything to overcomplicate Scratch's design - only giving the user more options to explore as they learn and experiment.

Questions:

does it work with "switch backdrop to and wait" too?
what if the broadcast yields? (the "freezing" occurs because of Scratch's strange behavior with yield.)
what if the broadcast spawns a new broadcast?

does it work with "switch backdrop to and wait" too?

Makes sense to, yeah! I forgot about that block but yes, it would make sense to implement that too.

Keeping the code relatively general (to support both "and wait" blocks) would be a good thing to keep in mind while writing an implementation.

what if the broadcast yields? (the "freezing" occurs because of Scratch's strange behavior with yield.)

The broadcast would be run like an extension of the blocks in the run-without-screen-refresh: a "repeat 10" would be the same as repeat 10 regularly behaves inside a run w/o screen refresh (i.e. all in one frame).

This means that if the receiver contains a "forever" block (or similar infinite loops), we would still run into the 500ms limit. But the purpose of this suggestion isn't to avoid running into that limit in all cases; it's just to make the "broadcast and wait" block not necessarily reach the 500ms frame limit, and instead behave like a direct extension of the custom block.

(If you want to "launch" a broadcast to be run on the next frame & outside of the "run without screen fresh" context, you can still do that the way you always would, i.e. with the "broadcast" (no wait) block.)

what if the broadcast spawns a new broadcast?

Since the blocks in the receiving script are acting as a direct continuation of the custom block, they would behave the same way: "broadcast and wait" would execute the receiving scripts and return back to the sending script within the same frame.

IMO, this feels intuitive, since the wording of "run without screen refresh" implies everything it does will be run without a single frame delay, so it makes sense that the scripts receiving broadcasts would also run to completion without screen refresh.

(If a receiving script starts a new broadcast without "and wait", then that broadcast would be delayed until the next frame and would not pause the current frame's execution, like normal.)

Edit: an edge case to consider is the behavior where a broadcast targeting a receiver which is already running will cancel that receiver script, and restart it from beginning. I believe in the context of this suggestion such scripts would be pretty uncommon, since receivers would probably only be activated by run w/o screen refresh blocks in the first place (and thus would never be running at the end of a frame), but they should still be considered.

Actually, they are pretty common if you try to do recursive broadcasts. This is important to consider for OOP-style learning, because recursive broadcasts could be very common there.

One option is to simply use the current behavior, meaning "layered" (recursive) broadcasts would not be supported, except for tail-calls (i.e. a broadcast block at the end of a "when I receive" script, which are not actually recursive).

Another option is to treat the receiving scripts as a more literal continuation to the custom block code, meaning it would simply execute the contained blocks as though they were part of the current script. I believe this is (theoretically) the easiest way to implement. It would support "layered" broadcasts, where two instances of the receiver may be running at once (but all will still, as long as they don't hit the 500ms limit, complete within the same frame, finally passing control back to the original run-without-screen-refresh custom block). In terms of the edge case I mentioned earlier, this would also not interrupt (cancel) a receiver which was running before the custom block was called.

In case this issue gets prioritized or reviewed sooner or later, here's a sketch of the implementation details.

API:
- The API for starting hat blocks, used by broadcastAndWait and switchBackdropAndWait, is BlockUtil.startHats. (This defers to Runtime.startHats.) A minimal implementation shouldn't involve duplicating any of this code, so if possible, we should be working with the existing return value of BlockUtil.startHats.
- There's currently much duplication of logic between broadcastAndWait and switchBackdropAndWait. In fact, aside from a different call to startHats (compare helper function _setBackdrop vs. the first lines of broadcastAndWait), these two functions are exactly the same logic. To avoid adding more duplicate code, and reduce existing duplicate code, a new API should be added; I would suggest either startHatsAndWait or waitForThreads.
  - startHatsAndWait passes its arguments to startHats and then runs the waiting logic.
  - waitForThreads accepts the return value of startHats and runs the waiting logic on it.
  - startHatsAndWait feels more natural/"Scratch-y", but would necessitate reworking the _setBackdrop helper function a little, since at the moment it runs startHats itself. Probably the easiest change would be to add an optWait boolean argument, but _setBackdrop is already pretty complicated.
  - Both are relatively simple API changes to implement and use, regardless.
- See below (2.) for details on a new argument for both Runtime.startHats and BlockUtil.startHats, optForceNewThread.
Changes to startHats:
- As mentioned in the last comment (see quote below), recursive broadcasts are powerful and a direct analogue for OOP-inspired communication between sprites, which is the goal here.
- It would support "layered" broadcasts, where two instances of the receiver may be running at once (but all will still, as long as they don't hit the 500ms limit, complete within the same frame, finally passing control back to the original run-without-screen-refresh custom block).
- The current implementation of Runtime.startHats has two possible branches, depending on metadata for the hat block in question. We're concerned with the case where the flag hatMeta.restartExistingThreads is set. Here, threads which are currently running are restarted. (When the flag is not set, as in the case of "when key pressed" for example, hats which already have a corresponding thread will simply be skipped.)
- In theory, this is to prevent multiple threads of the same hat from running at the same time. While this is true, please note it is a user-facing design decision, not due to a technical limitation. In fact, there's already an exception to this general rule as precedent: as a comment in the above code writes, "stack click threads and hat threads can coexist". Check out this project for a demo.
- So, as requirement for recursive, "layered" broadcasts, we will need to implement a new flag to bypass these checks and always start a new thread, regardless of if there is or is not an existing thread. Since whether or not to enable this flag will be determined by the caller, I suggest a new optional argument, optForceNewThread.
- This would need to be added to both Runtime.startHats and BlockUtil.startHats, so that the caller of startHats may choose whether to specify the flag. (It is important we don't set it when using "broadcast" (no "and wait", for example, because "broadcast" on its own can be intentionally used to restart receivers.)
- All in all, this is a fairly simple addition with existing, demonstrable precedent (meaning it's harmless, too).
The "run without screen refresh" of instantaneous broadcasts:
- The technical term for "run without screen refresh" is warp mode.
- When a custom block labeled "run without screen refresh" is run, it sets the warpMode flag on the newly added stack frame.
- The warpMode flag is bubbled into newly pushed stack frames too.
- Newly started threads do not inherit warpMode. To counteract this, we'll simply act like "run without screen refresh" blocks do: we'll bubble it ourselves. This should be done as part of either BlockUtility.startHats or BlockUtility.startHatsAndWait/BlockUtility.waitForThreads (see below for reasoning).
- In warp mode, yielded blocks are re-executed immediately.
- "Break-out points" exist to keep warp-mode threads from going into infinite (or exceptionally long) loops and hanging the browser (causing the user, be they project creator or viewer, to lose progress). It's imperative we don't accidentally create a new way to reach this infinite loop.
  - In general, a "break-out point" is defined as returning to a yield once a timer has passed 500 milliseconds of execution time. The timer, warpTimer, is stored on the thread, not the stack frame.
  - A break-out point exists after executing any looping block ("repeat", "forever", etc).
  - A break-out point exists when starting a procedure call.
- In the most basic form, it is theoretically possible to skip both these breakout points; see this hypothetical "demo" project. (Tl;dr: "when I receive message1, broadcast message1 and wait".)
- Newly started threads do not inherit a warpTimer. This is problematic! We can't know when to break out if we don't have access to a shared warp timer.
- We could try making a change to _pushThread to accept an optional warpTimer to inherit, but its caller and public API, Runtime.startHats, doesn't have access to a thread or warp timer. We should avoid making (additional) changes to an existing API if possible.
- Instead, we'll manually copy over warpTimer from the only point where we would be passing a warp timer in the first place: BlockUtility.startHats. Or, alternatively, BlockUtility.startHatsAndWait or BlockUtility.waitForThreads, depending on which one is implemented.
  - Pseudocode:
```
result.forEach(addedThread => {
addedThread.warpTimer = callerThread.warpTimer;
});
```
- We still need to implement a new break point. This as well should be part of startHatsAndWait or waitForThreads.
- Newly started threads don't execute until the next yield. Check out this project for a demo. This is an issue because even if warpMode and warpTimer are appropriately provided to the newly created thread, it still won't do anything within that frame unless it is actually executed.
- However, we cannot arbitrarily change this behavior. Existing Scratch projects depend on the current timing - but only when not in warp mode.
- So, once again as part of startHatsAndWait or waitForThreads, we should also execute each of the newly created threads, but only if in warp mode.

To wrap this all up, here is a semi-pseudocode draft of what the startHatsAndWait and waitForThreads functions (which share a majority of their code - again, only one of them would be implemented in the end) would look like:

function startHatsAndWait(requestedHat, optMatchFields, optTarget) {
    // If in warp mode, pass a flag to force started hats to create new threads.
    const warpMode = this.peekStackFrame().warpMode;
    const threads = this.startHats(requestedHat, optMatchFields, optTarget, warpMode);

    /* ... (see below) */
}

function waitForThreads(threads) {
    const warpMode = this.peekStackFrame().warpMode;

    /* ... (see below) */
}

/* (Common code:) */

if (warpMode) {
    // Make threads inherit warpMode and warpTimer if they haven't already.
    for (let i = 0; i < threads.length; i++) {
        if (!threads[i].warpTimer) {
            threads[i].peekStackFrame().warpMode = true;
            threads[i].warpTimer = this.thread.warpTimer;
        }
    }
    // Don't step the threads once the timer is up.
    if (this.thread.warpTimer.timeElapsed() > Sequencer.WARP_TIME) {
        return;
    }
    // Step threads one by one in warp mode. They will execute until completion, or until
    // the inherited warpTimer is up.
    for (let i = 0; i < threads.length; i++) {
        this.sequencer.stepThread(threads[i]);
    }
} else {
    /* (The usual logic for broadcastAndWait/switchBackdropAndWait would go here. */
}

As described in detail above, here are the additional necessary changes, in summary:

Add optForceNewThread argument to Runtime.startHats, plus the relevant logic.
Add optForceNewThread argument to BlockUtil.startHats, and pass it to Runtime.startHats.
If we go with waitForThreads, in all appropriate places (broadcastAndWait and _setBackdrop), make sure to pass this.peekStackFrame().warpMode to startHats as the new fourth option, optForceNewThread.
If we go with startHatsAndWait, refactor Scratch3LooksBlocks._setBackdrop to take an optWait flag argument, and call either startHatsAndWait or startHats depending on its value.

Hopefully this should act as a solid basis for implementing this suggestion, sooner or later, if the interest is there.

Thanks to @apple502j for the feedback questions in https://github.com/LLK/scratch-vm/issues/2834#issuecomment-766193588, which helped solidify these implementation notes!

Absolutely brilliant! 😃 I would love to see this added to Scratch. Replace a buggy behavior with an intuitive and useful one that enables Scratchers to learn more about programming. I see no downside (other than the implementation effort).

Surprisingly, you can already perform sprite communication within a single frame as long as it's not recursive. This surprised me! I thought there'd be a one frame delay between a broadcast and the script running like there is in Scratch 1.4, but clearly this was changed at some point.

This means you can stuff all the 'when I receive' code into another no-refresh block and it will run immediately. You can get this behaviour, you just have to work harder for it. In that case, I think it would be better for the 'broadcast and wait' inside a no-refresh block to be able to refresh normally, as should also happen with 'wait', 'ask' and anything else that makes it pretty clear the user wants a refresh regardless of what the VM thinks.

Now, for more niche behaviour, I think there's an argument to be made that 'broadcast and wait' should always support non-tail recursion even outside of no-refresh, but that's an issue for another time. 😄

Hey @rdococ, nice to see your feedback on some older suggestions :)

This is something I was investigating a little while ago also! I created a couple projects which demonstrate capabilities of "instant" broadcasts in Scratch: Instant Broadcasts w/ Easy Rendering and Fast Broadcast and Wait OOP demo. (A year and a half ago apparently, but it feels more recent. 🥞)

I haven't looked deeply into behavior with recursion as it relates to running multiple of the same broadcast per frame — but you should note that broadcasts just don't support recursion at all in the first place. That is, if you broadcast a message whose "when I receive" scripts are already running, the execution of those scripts will be completely cancelled and replaced with a fresh execution. If you're unfamiliar with this behavior, I discuss it in detail here: https://github.com/leopard-js/leopard/issues/106#issuecomment-1464993510

The trick with "broadcast and wait", and indeed with any "wait one tick" behavior, is that one Scratch tick is not necessarily equal to one Scratch frame. Scratch only pauses execution to render a new frame (which is timed to a 1/30th-second interval) if any blocks executed within the previous tick caused changes which should be reflected on-screen. This includes any motion or looks blocks, for example, but also includes, perhaps counterintuitively, "run without screen refresh" blocks which themselves run motion/looks blocks. ("Run without screen refresh" means "don't relieve this tick's execution until the entirety of this block is completely evaluated", not "suppress screen refresh requests".)

So, if no blocks executed within a given tick request a screen refresh, that tick leads immediately into the next tick — no screen refresh and no 1/30th-second delay. Accordingly, "broadcast and wait" itself never causes a delay; rather, its receiver scripts are always delayed until the next tick, but if no blocks requested a screen refresh, that tick will come up immediately, rather than being delayed. With a little cleverness it's doable to delay a screen refresh until absolutely required, and not very complicated as long as you don't need (immediate) access to screen-sensing primitives such as "touching (sprite)" or "touching (color)". That's the behavior I was making use of in my two linked projects.

The awkwardness of integrating this into existing projects is that ANY blocks which affect the screen have to be deliberately delayed until you're certain everything you want to do this frame is done. You also only have 1/30th of a second to work with, and run into really bad screen tearing-like issues if that 30th of a second passes before you're ready to display, as execution is forcibly paused. (It's a very similar effect to how early 8- and 16-bit games would slow down if there were too many sprites on screen at a time!)

Modifying how the tick system works to accommodate instant broadcasts would be complicated to say the least, but my suggestion here is thankfully still workable to implement! The root of it is to, instead, bypass the tick-cycling system entirely, as "run without screen refresh" already does. It would just fully evaluate a new nested context for each "when I receive" script, with basically identical behavior as how you can nest custom blocks inside a "run without screen refresh" custom block's definition already. The groundwork is already there; this suggestion would just mean replicating that behavior in "broadcast and wait" when it's put inside a "run without screen refresh" custom block.

This means you can stuff all the 'when I receive' code into another no-refresh block and it will run immediately. You can get this behaviour, you just have to work harder for it. In that case, I think it would be better for the 'broadcast and wait' inside a no-refresh block to be able to refresh normally, as should also happen with 'wait', 'ask' and anything else that makes it pretty clear the user wants a refresh regardless of what the VM thinks.

Aha! Here's the trick. You say "you can (do a specific form of waiting) in a no-refresh block and it will run immediately", but... that's not true, because that's not what "run without screen refresh" does, as I mentioned above. It simply halts execution of other scripts until the custom block execution finishes. There's a 500 millisecond timeout so that projects which put any kind of waiting behavior in a "run without screen refresh" don't just break the Scratch webpage entirely! Try it:

when flag clicked: reset timer, run custom block, say timer; custom block definition: broadcast message1 and wait; when I receive message 1: (empty script)

The whole webpage freezes for half a second, and the sprite says 0.50 (showing half a second passed).

So yes, it's basically doable to implement instantaneous broadcasts as-is, but it can be tricky to figure out and especially to integrate into existing projects. There are performance concerns, as failing to complete all your behavior within 1/30th of a second causes visual and speed issues (which are potentially very jarring), and unlike an old SNES, all computers perform at different speeds, and a project developer can't accurately predict just how much work gets done in that time (without performing their own profiling inside the project and constraining behavior to fit estimates).

It's not an intuitive approach, so while the ceiling is technically higher than I suspected when I created this suggestion, the floor — within that context — is lost way up in space, and my suggestion aims to make the methods of getting similar behavior both more reliable and far more accessible :)

Now, for more niche behaviour, I think there's an argument to be made that 'broadcast and wait' should always support non-tail recursion even outside of no-refresh, but that's an issue for another time. 😄

I'd love this LOL. It's definitely impossible to implement in Scratch as-is because, apart from custom block definitions, one script simply can't have two "threads" operating with its blocks at once, and would be a breaking change since canceling broadcasts is a deliberate (mis)feature which has been built upon by lots of Scratchers (consciously or not). But it's a nice idea.

In my suggestion, no-refresh, non-tail recursion would be free because 1) it bypasses the tick system entirely and hooks into the same existing behavior as non-tail recursive custom blocks, and 2) it wouldn't break existing projects BECAUSE... insofar as projects don't depend on freezing up an entire computer for 0.5 seconds with the "broadcast and wait" block in particular, the existing behavior is broken! ✨

(Full disclosure: I apparently left some detailed implementation notes earlier, before I learned about the existing method for instant broadcasts. ~~I haven't peer-reviewed my own notes today, so I don't know if they're fully in line with my less-detailed implementation discussion in this comment. 📦~~ Edit: If interested in implementing, still go read past me's notes! I looked them over and they're basically in line with my comment today, only with much more concrete details, and a cleaner, more general-purpose implementation to boot. Woo hoo!)

scratchfoundation / scratch-vm

(Feature Request) Better sprite communication: "Broadcast and wait" should run instantaneously inside no-refresh custom blocks #2834