google / xls

XLS: Accelerated HW Synthesis
http://google.github.io/xls/
Apache License 2.0
1.2k stars 174 forks source link

[enhancement] Early return in DSLX functions #1567

Open mikex-oss opened 1 month ago

mikex-oss commented 1 month ago

What's hard to do? (limit 100 words)

Code can often be simpler and more readable (reduced nesting and/or eliminate extraneous computations) if you can early return with some guard statement.

For example:

https://github.com/google/xls/blob/052a2a3dfd71ea190a9862355c80346b44203fa8/xls/dslx/stdlib/apfloat.x#L1367-L1372 was intended to simplify the code but is actually thrown away since these cases match a later condition: https://github.com/google/xls/blob/052a2a3dfd71ea190a9862355c80346b44203fa8/xls/dslx/stdlib/apfloat.x#L1379-L1380

If the developer could instead have written an early return, they can systematically handle the simpler cases and return, properly ignoring them in later parts of the function.

Current best alternative workaround (limit 100 words)

The refactored code in the example above was turned into a giant if/else if/else:

https://github.com/google/xls/commit/414e6b49ee3dbb81706bb45dc15ae72c62eb31db#diff-215bf4395c04c579d061e991f09dcc1e551d5b7920699eaa91c82898fe1e76cf

Your view of the "best case XLS enhancement" (limit 100 words)

Add the return keyword which can be used in an if block like in Rust: https://doc.rust-lang.org/std/keyword.return.html

ericastor commented 1 month ago

I very much like this idea... but I think it needs some fleshing out, particularly around nested conditional expressions.

For example, what sort of rewrite should apply to this:

...
let x: u2 = if x0 {
  if x1 {
    return u32:256;   
  } else {
    u2: 0
  }
} else {
  u2: 1
};
...
q
cdleary commented 4 weeks ago

Leaving out early returns (historically) was intentional -- we chatted about this largely in lunch conversations and similar. The challenge is that in a software programming language early returns are kind of "strictly good" -- you cut out execution of a bunch of instructions temporally, which for a sequentially executing von Neumann machine, is swell. However, when we're synthesizing hardware, they do not cut out instructions, and instead create (now implicit) muxes between paths that may have very different delays. If you have convergent control flow in a single expression, it's clear that they all "run to completion" in the same result expression, so the fact you have to keep balanced control that ends mostly at a single point may be helpful to indicate the delay of a given computation.

This is one of the things IMO where "C style is not good at making you think about the hardware tradeoffs".

CC @grebe as we had discussed this.

proppy commented 1 week ago

However, when we're synthesizing hardware, they do not cut out instructions, and instead create (now implicit) muxes between paths that may have very different delays.

nit: wouldn't clock (or power) gating #1360 be more or less an hardware equivalent of "cutting down executing" of part of the circuit temporally?

so the fact you have to keep balanced control that ends mostly at a single point may be helpful to indicate the delay of a given computation.

I also think it helps to visually measure the "complexity" of a given computation, and give incentive to split it in smaller functions/circuits (where-as early return kind-of allow you to keep stuffing things in the same compute block).

mikex-oss commented 1 week ago

Discussed this over another lunch conversation with @grebe and @hongted a short while back.

It seems like if you treat early returns as an implicit "else" around the rest of the function body, then early returns would actually do the intuitive thing as far as hardware goes. The muxes actually are inserted in reverse from the output end with the first early return value fed directly to this last mux.

if foo:
  return RV0

blah blah

if bar:
  return RV1

more blah

if baz { RV2 } else { RV3 }
RV0 -----------------------------> |     |
RV1 ----------------> |     |      | mux |  ---> RV
RV2 ---> |     |      | mux | ---> |     |
         | mux | ---> |     |         |
RV3 ---> |     |         |           foo
            |           bar
           baz

On the other hand, in the current state without early returns, you may end up doing the assign and mutate pattern as shown in the initial comment above. For example:

let rv = RV3;

let rv = if foo { RV0 } else { rv };

blah blah

let rv = if bar { RV1 } else { rv };

more blah

if baz { RV2 } else { rv }
RV2 -----------------------------> |     |
RV1 ----------------> |     |      | mux |  ---> RV
RV0 ---> |     |      | mux | ---> |     |
         | mux | ---> |     |         |
RV3 ---> |     |         |           baz
            |           bar
           foo

This isn't any better from a hardware perspective; in fact, the "short" path the user has in their mind (RV0) actually needs to propagate through 3 muxes. It's also just harder from a readability perspective since the code doesn't follow single-assignment form.

proppy commented 1 week ago

On the other hand, in the current state without early returns, you may end up doing the assign and mutate pattern as shown in the initial comment above. For example:

Is this example equivalent to the one w/ early returns? if foo && bar && baz are true the first example is RV0 while the later is RV2.

It's also just harder from a readability perspective since the code doesn't follow single-assignment form.

What about nested form?

if foo {
  RV0
} else 
  // blah blah
  if bar {
    RV1
  } else 
  //  more blah
  if baz { RV2 } else { RV3 }
}

I personally find it "more-readable" than the second example w/ the dead else blocks and I its cascading shape somewhat map visually to the underlying hardware:

RV0 -----------------------------> |     |
RV1 ----------------> |     |      | mux |  ---> RV
RV2 ---> |     |      | mux | ---> |     |
         | mux | ---> |     |         |
RV3 ---> |     |         |           foo
            |           bar
           baz
mikex-oss commented 1 week ago

Is this example equivalent to the one w/ early returns?

@proppy you're right! (I'll leave that error there rather than editing it though.) While not intentional, it does sorta emphasize my point. As in the original comment, it's relatively easy to reason incorrectly about the return value since you need to visually scan backward through all the conditionals. To get the same behavior, I should have written the code bottom up.

As for your 3rd case, fair, though we've talked about deep nesting and whether that's a good thing in Rust-like languages. ("blah blah" and "more blah" were included as a minimal attempt to keep this from being trivial).