bytecodealliance / wasmtime

A fast and secure runtime for WebAssembly
https://wasmtime.dev/
Apache License 2.0
15.21k stars 1.28k forks source link

ISLE: allow declaring internal constructors as infallible #3546

Open fitzgen opened 2 years ago

fitzgen commented 2 years ago

In https://github.com/bytecodealliance/wasmtime/pull/3545, we added this internal constructor:

(decl vector_size (Type) VectorSize)
(rule (vector_size (multi_lane 8 16)) (VectorSize.Size8x16))
(rule (vector_size (multi_lane 16 8)) (VectorSize.Size16x8))
(rule (vector_size (multi_lane 32 4)) (VectorSize.Size32x4))
(rule (vector_size (multi_lane 64 2)) (VectorSize.Size64x2))

it generates this code:

// Generated as internal constructor for term vector_size.
pub fn constructor_vector_size<C: Context>(ctx: &mut C, arg0: Type) -> Option<VectorSize> {
    let pattern0_0 = arg0;
    if let Some((pattern1_0, pattern1_1)) = C::multi_lane(ctx, pattern0_0) {
        if pattern1_0 == 8 {
            if pattern1_1 == 16 {
                // Rule at src/isa/aarch64/inst.isle line 952.
                let expr0_0 = VectorSize::Size8x16;
                return Some(expr0_0);
            }
        }
        if pattern1_0 == 16 {
            if pattern1_1 == 8 {
                // Rule at src/isa/aarch64/inst.isle line 953.
                let expr0_0 = VectorSize::Size16x8;
                return Some(expr0_0);
            }
        }
        if pattern1_0 == 32 {
            if pattern1_1 == 4 {
                // Rule at src/isa/aarch64/inst.isle line 954.
                let expr0_0 = VectorSize::Size32x4;
                return Some(expr0_0);
            }
        }
        if pattern1_0 == 64 {
            if pattern1_1 == 2 {
                // Rule at src/isa/aarch64/inst.isle line 955.
                let expr0_0 = VectorSize::Size64x2;
                return Some(expr0_0);
            }
        }
    }
    return None;
}

It handles all cases, but neither the Rust compiler nor the ISLE compiler can really tell that. So it would be nice if we could mark it infallible in the ISLE source somehow, and then have an unreachable!() at the end of the generated function, instead of return None.

Please commence with syntax bike shedding suggestions :)

alexcrichton commented 2 years ago

Possible alternative idea:

aka we'd do:

(decl vector_size (Type) VectorSize)
(rule (vector_size (multi_lane 8 16)) (VectorSize.Size8x16))
(rule (vector_size (multi_lane 16 8)) (VectorSize.Size16x8))
(rule (vector_size (multi_lane 32 4)) (VectorSize.Size32x4))
(rule (vector_size (multi_lane 64 2)) (VectorSize.Size64x2))
(rule (vector_size _ty) (compile_error "unknown vector size"))

and this would infer that the vector_size constructor is infallible due to the last case, and the last case would be a manually injected panic with a nice message.

github-actions[bot] commented 2 years ago

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "isle" Thus the following users have been cc'd because of the following labels: * cfallin: isle * fitzgen: isle To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file. [Learn more.](https://github.com/bytecodealliance/subscribe-to-label-action)
cfallin commented 2 years ago

I'd like to lightly veto the "add the ability to automatically infer infallible" direction: in the limit this implies SAT-solving, and significantly complexifies things if we want to be able to see through invoked internal constructors as well. (Basically, inline all the way down, then flatten all conditions, then encode to SAT, then see if "fallthrough" condition is UNSAT.)

I like the unreachable!()-at-the-end approach in contrast: very lightweight change, and our fuzzers should hit it if we get it wrong...

alexcrichton commented 2 years ago

To clarify though I wouldn't propose any sort of sat-solving or "did you check all the cases" checking, instead just simply "are there no constraints on any arguments in one rule" as a "is this exhaustive" check, which I think would be easy to implement and only "surprising" if you had a rule-per-case and expected it otherwise to be exhaustive

fitzgen commented 2 years ago

First add the ability for ISLE to automatically infer that constructors are infallible (e.g. if the function would end with return Some(...) then remove the Option)

I view this as orthogonal from this issue which focuses on cases where even if we had that exhaustive-match checking the ISLE compiler wouldn't be able to tell this is infallible.

fitzgen commented 2 years ago

So it would be nice if we could mark it infallible in the ISLE source somehow, and then have an unreachable!() at the end of the generated function, instead of return None.

(Also, in case it isn't obvious, the generated function would return T instead of Option<T> as well)

fitzgen commented 2 years ago

Second add the ability to define a rule that yields an error.

Are you imagining a compile time error? This would make the exhaustive-match checking part of the semantics of the language, rather than an internal optimization detail, FWIW.

Alternatively, we can do this with runtime errors right now, by having an extern constructor that just panics (although we can't pass a string argument through, I'd like to add that ability tho).

alexcrichton commented 2 years ago

Hm sorry so let me try to clarify. Right now ISLE generates constructors always returning Option<T>. I think that's somewhat of a minor bug because there are functions where it ends with return Some(val) and nothing actually returns None. For compile-time-performance-reasons primarily I think that ISLE, independent of any other changes, should be able to infer this situation. When the function never returns None then it should return T, not Option<T>.

Assuming that that world exists, then there's no need to manually mark anything as infallible, it's simply already infallible by construction (nothing returns None). There are still esoteric cases where if you do something like match half the state space in one rule and the other half in another rule then ISLE doesn't know about that, and ISLE still thinks that the function is fallible (because there's a fall-through that returns None and ISLE doesn't realize the previous two rules cover the whole state space, only we as humans understand that).

So what I would imagine is a second addition to ISLE, after auto-inference of infallibility, which is to have some sort of (compile_error "foo"). This means that in the situation that I'd like to assert that the whole state space is covered by my two rule annotations I'd add a third catch-all with the compile error explaining why this shouldn't happen.

I view this as orthogonal from this issue which focuses on cases where even if we had that exhaustive-match checking the ISLE compiler wouldn't be able to tell this is infallible.

The reason I don't believe that these are orthogonal is that if we assume that ISLE, for performance reasons, will infer infallibilty for other normal rules then there's no need for a new "panic here" feature to ISLE to also reason about fallibility. For example instead of being able to say "this constructor is infallible and here's the panic message if I'm wrong" I think it'd be better to say "if you hit this rule then panic with this message".

Are you imagining a compile time error?

No, I'm not imagining a Rust-like exhaustiveness check. Plain-and-simple "does the function end with return Some(..)? If so the function is infallible, if not the function is fallible, nothing fancier.

fitzgen commented 2 years ago

Assuming that that world exists, then there's no need to manually mark anything as infallible,

But the original example is something that cannot be caught by simple exhaustiveness checking.

For compile-time-performance-reasons primarily I think that ISLE, independent of any other changes, should be able to infer this situation. When the function never returns None then it should return T, not Option<T>.

I agree that we should do this, and this is what I was saying was an orthogonal compilation that the ISLE compiler can perform.

So what I would imagine is a second addition to ISLE, after auto-inference of infallibility, which is to have some sort of (compile_error "foo"). This means that in the situation that I'd like to assert that the whole state space is covered by my two rule annotations I'd add a third catch-all with the compile error explaining why this shouldn't happen.

I think compile_error is misleading because the ISLE compiler won't bail out with an error if it can't prove exhaustiveness here, it will just insert _ => unreachable!("foo").

This is equivalent to what I was suggesting in the OP. We can either do

(decl foo (A B) C)
;; ...
(rule (foo _a _b) (unreachable "I covered the whole state space")

or

(decl foo infallible (A B) C)
;; ...

but they should both mean that we get unreachable!() at the end instead of return None.

And then separately from that, the other do-we-ever-return-None? optimization can turn the signature from Option<T> to T.

cfallin commented 2 years ago

So I thought a bit about this just now:

should be able to infer this situation. When the function never returns None then it should return T, not Option.

and I have to say that I think it implies a level of compiler complexity that we probably don't want to take on, for the basic reason that it implies a global (all-rules) inference with magic action-at-a-distance behavior, and a significant ISLE compiler rearchitecting to make that possible.

The basic issue is that this implies (i) we change the signature of a ctor based on its content (all cases covered --> T rather than Option<T>), and so (ii) we change the way we call one ctor from another ctor based on the other ctor's content, and so (iii) the body of ctor X depends on the body of ctor Y.

(Said another way, one rule can invoke another, and so one rule's infallibility depends on all invoked rules' infallibility.)

This is tractable if we disallow cycles; topo-sort and compile bottom-up, pinning down signatures as we go. But while we've discussed stratification / disallowing cycles, we currently don't, and IMHO the ability to (co)recurse might actually be useful at some point. But if we allow cycles then we have a global fixpoint analysis for infallibility!

This feels very similar to analogous global-type-inference issues in other languages; I'd prefer not to reinvent half of Hindley-Milner here :-)

I think probably the right approach is what @fitzgen suggests above, with an explicit infallible keyword in the decl; this is the moral equivalent of (edit: actually just literally) requiring explicit type signatures at the top level to allow separate function compilation.

Thoughts?

bjorn3 commented 2 years ago

We could bail out on cycles. That should still catch the vast majority of the cases.

cfallin commented 2 years ago

@bjorn3 that seems to me not to solve the fundamental issue, which is the complexity and action-at-a-distance, and adds a hybrid sometimes-available-but-there's-a-perf-cliff-to-avoid flavor of problem as well. I'd prefer to have signatures nailed down explicitly by the decls, with something like Nick's proposed syntax.

alexcrichton commented 2 years ago

I'm happy to defer to y'all and I don't feel too strongly about this. I don't get the impression my idea is fully understood, but it's not really that important anyway.

cfallin commented 2 years ago

@alexcrichton if there's a way to do it that we're missing, I'm still all-ears! I definitely understand the "if the function body never returns None then we can return T" idea, locally to codegen for the function; the issue as I understand it is just at callsites to this function, where we have to know whether it returns Option<T> or T without examining its body. Anyway, sorry if we're missing a solution here and happy to discuss more if desired :-)