zhong-j-yu / rekex

PEG parser generator for Java 17 - grammar as algebraic datatypes
Apache License 2.0

Return type of ctor must be equal to datatype #4

Closed zhong-j-yu closed 3 years ago

zhong-j-yu commented 3 years ago

Currently, we allow the return type of a ctor to be a subtype of the target type. Therefore

public A1 a1(...){...}

is a ctor for type A, given A1 <: A.

This makes sense from Java's point of view, where it's always safe to make the return type a more specific type. But it is problematic for mapping between grammar rules and ctors. Typically, a complex grammar contains rules like

A = A1 | A2
...
A1 = ...
...
A2 = ...

It's not easy to review a ctor catalog to find all ctors for A and confirm that they are in the correct order.

If the user explicitly declares ctors Ai->A

public A a1(A1 a1){ ... }

it is unclear how to handle them together with ctors that return subtypes.
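To make the overlap concrete, here is a minimal sketch in plain Java reflection. It is not Rekex's actual implementation; `SubtypeRuleDemo`, `Catalog`, and `ctorsFor` are hypothetical names used only for illustration. It collects every method whose return type is a subtype of the target, which is the matching rule described above.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class SubtypeRuleDemo {
    interface A {}
    record A1() implements A {}
    record A2() implements A {}

    // Hypothetical ctor catalog mixing both styles from the discussion.
    static class Catalog {
        public A a1(A1 a1) { return a1; }   // explicit A1 -> A
        public A1 a1() { return new A1(); } // builds A1, but A1 <: A
        public A2 a2() { return new A2(); } // builds A2, but A2 <: A
    }

    // Subtype rule: any method whose return type is assignable to target counts.
    static List<String> ctorsFor(Class<?> target) {
        List<String> found = new ArrayList<>();
        for (Method m : Catalog.class.getDeclaredMethods()) {
            if (target.isAssignableFrom(m.getReturnType())) {
                found.add(m.getReturnType().getSimpleName() + " " + m.getName()
                        + "/" + m.getParameterCount());
            }
        }
        // getDeclaredMethods() returns methods in no particular order,
        // so sort the descriptions to get a stable result for this demo.
        found.sort(null);
        return found;
    }
}
```

Under this rule, `ctorsFor(A.class)` picks up all three methods, so the explicit `A1 -> A` ctor and the ctors that merely return `A1` or `A2` compete as alternatives for `A`, and nothing in the signatures says how they should be ordered relative to each other.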

We should take a simpler approach that maps more directly to grammar rules, is easier to reason about, and also gives us more flexibility:

The return type of ctors for A must be exactly A. It is then easy to see all ctors for A in declaration order. If no such ctors are found, add implicit ctors Ai->A for the direct subtypes, as if they were declared as

public A a1(A1 a1){ return a1; }

The order of these implicit ctors is the order of the subtypes, which is also easy to see. This order is stable, and ctors for the subtypes themselves can be arranged in any order.
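The proposed rule can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the library's code: `ExactRuleDemo`, `Catalog`, and `ctorsFor` are hypothetical names, and the direct subtypes are assumed to come from a sealed type's `permits` list. Note that the Javadoc for `Class.getPermittedSubclasses()` leaves the element order unspecified, so a real implementation would need a dependable source for the subtype order.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class ExactRuleDemo {
    // Assumption: the direct subtypes of A are the permitted subclasses.
    sealed interface A permits A1, A2 {}
    record A1() implements A {}
    record A2() implements A {}

    // Hypothetical catalog that declares ctors for the subtypes only.
    static class Catalog {
        public A1 a1() { return new A1(); }
        public A2 a2() { return new A2(); }
    }

    static List<String> ctorsFor(Class<?> target, Class<?> catalog) {
        List<String> found = new ArrayList<>();
        // Exact rule: only methods whose return type is the target itself.
        for (Method m : catalog.getDeclaredMethods()) {
            if (m.getReturnType() == target) {
                found.add("explicit " + m.getName());
            }
        }
        // No explicit ctors: fall back to one implicit ctor per direct subtype,
        // each behaving like `public A ai(Ai ai){ return ai; }`.
        if (found.isEmpty() && target.isSealed()) {
            for (Class<?> sub : target.getPermittedSubclasses()) {
                found.add("implicit " + sub.getSimpleName() + "->" + target.getSimpleName());
            }
        }
        return found;
    }
}
```

Here `ctorsFor(A.class, Catalog.class)` finds no method returning exactly `A` and falls back to the implicit `A1->A` and `A2->A` entries, while `ctorsFor(A1.class, Catalog.class)` finds only the explicit `a1`. The ctors for `A1` and `A2` no longer take part in the alternatives for `A`, so they can be declared anywhere in the catalog.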

The user may have good reasons to explicitly declare Ai->A: to limit the subtypes, to order them differently, to make ctors resemble grammar rules more closely, or to do some transformation or test a semantic predicate in the method body.

public A a1(A1 a1){ ... }
public A a2(A2 a2){ ... }
// or: public A a(Alt2<A1,A2> alt){...}

public A1 a1(...){...}
public A2 a2(...){...}