runtimeverification / k

K Framework Tools 7.0
BSD 3-Clause "New" or "Revised" License
443 stars 145 forks source link

Disallow `syntax {S} S ::= ...` #3839

Open Scott-Guest opened 10 months ago

Scott-Guest commented 10 months ago

Productions syntax {S} S ::= ... where the sort parameter S only appears on the LHS are buggy, and appear to be de-facto unsupported.

At a minimum, they violate restrictions that we place on hooked sorts. For example,

syntax Bool ::= foo()
syntax Bool ::= MyBoolSubsort

is disallowed because you cannot add new constructors or subsorts to hooked sorts, thus

syntax {S} S ::= foo()
syntax {S} S ::= MySubsort

must be disallowed as well. (We don't currently report a nice error here, and instead report unrelated errors or crash depending on the exact example).

We could potentially permit syntax {S} S ::= "new_token" [token] or syntax {S} S ::= foo() [function] as hooked sorts allow token and function productions.

However, I’m unaware of any compelling use cases for this, and they are either unsupported or buggy in practice.

For example,

syntax {S} S ::= "bott1" [token]
               | "bott2" [token]
rule bott1 => bott2

results in

[Error] Internal: Uncaught exception thrown of type NoSuchElementException
(NoSuchElementException: key not found: SortS)

Given these issues, I propose that we just report an error in the frontend on any such production.

Baltoli commented 10 months ago

I think the check probably needs to exclude productions that are hooked (see https://github.com/runtimeverification/k/blob/master/k-distribution/include/kframework/builtin/domains.md#meta-operations):

syntax {Sort} Sort ::= #parseKORE(String) [function, hook(KREFLECTION.parseKORE)]
Baltoli commented 10 months ago

(although, that production seems to be the only one in domains.md that has the sort parameter only on the LHS, so there might be something we can do here - especially given how much of a nasty escape hatch #parseKORE is)

Scott-Guest commented 10 months ago

I see two options regarding #parseKORE:

I personally am in favor of changing the signature. This is a breaking change, but likely a small one given that #parseKORE is not widely used.

By doing so, we ensure that

Baltoli commented 10 months ago

Yep, on reflection I think let's change the interface of parseKORE and require projections. It should require some thought if you want to break out of the type system like that!

Worth also auditing the more internal parts of the standard library to make sure we haven't missed any other examples, but then if that's all good then let's go ahead and implement the frontend restriction.

Scott-Guest commented 10 months ago

It appears that a number of the matching logic symbols also have parametric sorts only on their LHS.

I don't understand these symbols well-enough to know how to handle this. Let's discuss in the meeting on Tuesday.

Baltoli commented 10 months ago

An attempt at a list of symbols with LHS-only parametric sorts:

Without spending too much time working through it, we might well be able to figure out a solution with the rough structure:

Baltoli commented 10 months ago

@Scott-Guest and @radumereuta discuss KBott

Otherwise special-case for ML symbols; otherwise disallow LHS polymorphism

Baltoli commented 10 months ago

@Baltoli @dwightguth write up the issue with #parseKORE not working in sort K.

Scott-Guest commented 10 months ago

If we can't add new constructors to hooked sorts, we shouldn't allow syntax {S} S ::= ... where the RHS is a constructor at all (regardless of whether S appears on the RHS or not).

Between kast.md and domains.md, it looks like we currently use parametric constructors for

If I understand correctly, the reason we don't allow adding new constructors to hooked sorts is that they can cause the hooks to crash at runtime. Thus, these constructor are only okay in so far as we guarantee they will be evaluated away before being passed to a hook.

A few questions then

Scott-Guest commented 9 months ago

Per @dwightguth:

#location is relatively easy to fix because it is enabled via an attribute applied to specific sorts so you should be able to autogenerate a non parametric symbol with a specific name for each such sort and then emit the corresponding one in the Bison code

amb will be harder because there is nothing theoretically preventing an ambiguity from occurring over a sort like Bool that would cause a crash, we just simply don't guarantee that everything will actually work correctly if such occurs

Scott-Guest commented 9 months ago

Given that this will require fairly involved fixes for #location and #amb, I'll implement the check but leave them as special-cases for now.

In the long run, we should try to address them though.

Scott-Guest commented 9 months ago

The current plan is to disallow syntax {S} S ::= ... declarations except for

As a result, every constructor's sort parameters only appear on the RHS. However, any sort parameter that only appears on the RHS can without loss of generality be realized as K, e.g.,

There is a slight difference in that, given Sub <: Super and subVal() : Sub,

In my opinion, this is good and desirable! The fact that foo{SubSort} and foo{SuperSort} are distinct is confusing and unintuitive given K's pervasive subtyping, e.g., it's surprising that a rule like

syntax Sub ::= SubVal()

syntax Super ::= Sub
               | SuperVal()

syntax {S} S ::= identity(S)
rule identity(Z:Super) => Z

does not apply to the term identity(SubVal()) as we discovered in #3833. If you do really need them to be distinct, then it's clearer to just manually implement every monomorphization as distinct constructors

syntax Super ::= identitySuper(Super)
syntax Sub ::= identitySub(Sub)

With this change then, we can restrict the syntax so that (for constructors) sort variables only occur as sort parameters, e.g.,

syntax Foo{S} ::= foo(S, S)

Coincidentally, this restriction is also a conceptual restriction shared by the haskell backend!

dwight.guth 3 years ago @tom.tuegel how complicated would it be to allow the Haskell backend to support parametric constructors? I seem to recall hearing you say something in the past that indicated you didn't want to go this route, but I wanted to bring it up because there are several cases during parsing where a parametric constructor substantially simplifies the implementation, and since the LLVM backend already supports this, I wanted to sound you out

tom.tuegel 3 years ago It should not be a significant difficulty to support parametric constructors, with some restrictions. As long as there is a definite sort constructor with the same parameters as the constructor symbol, it should be fine, that is: symbol Ctor{R, S, ...}(...) : Sort{R, S, ...} [constructor] Declarations like the following are syntactically legal, but logically problematic: symbol Any{S}(...) : S As long as we restrict ourselves to the first form, I think it should be alright.

To me, this a good indication that banning syntax {S} S ::= ... is morally the right choice.

Baltoli commented 9 months ago

That is very neat, and especially pleasing that it subsumes that LLVM backend issue (I hadn't quite clocked the link there during the meeting). Nice writeup - thanks @Scott-Guest