Closed olynch closed 1 week ago
@olynch I've just committed the work (unfinished) I started to work on renaming. You're right it's complex; it took me some time to develop a better understanding of the design after our discussion, so the work here is somewhat minimal.
As part to learning the design, I started a separate _usetheory
function to build my understanding from the ground-up.
As discussed in private conversation, we want to strip out all of the renaming support here, because we have decided that renaming is tricky to get right. So this should hopefully end up being a pretty simple PR.
The following is a summary of the work done so far on multiple-inheritance. While a @rename macro solution was proposed to close out the branch, I'm requesting help getting the using-as feature of multiple-inheritance over the finish line.
In March, @olynch and I discussed stripping out the renaming in favor of a @rename macro. That is implemented
@rename ThMod0 begin using ThModule: M as Form0 end
but after a ~three-month break, I once again took another stab at using default renaming with some success in defining ThModule through using statements. After some tweaking from catching the repo up with main, the following is a working theory in GATlab and has examples in src/stdlib/models/twosorted.jl
:
@theory ThModule begin
using ThAb: default as M, ⋅ as ⊕
using ThRing: default as Scalar, one as unit
(r ⋅ a) :: M ⊣ [r::Scalar, a::M] # R-actions
(r ⋅ (a ⊕ b)) == ((r ⋅ a) ⊕ (r ⋅ b)) ⊣ [r::Scalar, a::M, b::M] # R-action left-distributes
((r + s) ⋅ a) == ((r ⋅ a) ⊕ (s ⋅ a)) ⊣ [r::Scalar, s::Scalar, a::M] # addition of R-actions
(r * s) ⋅ a == r ⋅ (s ⋅ a) ⊣ [r::Scalar, s::Scalar, a::M] # composition of R-action
unit ⋅ a == a ⊣ [unit::Scalar, a::M] # unit
(a ⋅ r) :: M ⊣ [r::Scalar, a::M] # R-actions
((a ⊕ b) ⋅ r) == ((a ⋅ r) ⊕ (b ⋅ r)) ⊣ [r::Scalar, a::M, b::M] # R-action right-distributes
(a ⋅ (r + s)) == ((a ⋅ r) ⊕ (a ⋅ s)) ⊣ [r::Scalar, s::Scalar, a::M] # addition of R-actions
a ⋅ (r * s) == (a ⋅ r) ⋅ s ⊣ [r::Scalar, s::Scalar, a::M] # composition of R-action
a ⋅ unit == a ⊣ [unit::Scalar, a::M] # unit
end
I'm happy with the progress made to implement multiple-inheritance, but naturally over the course of learning the GATlab architecture, I likely made several suboptimal kludgey design choices, whereas I'd be more satisfied with an elegant solution. However because I'm somewhat invested in the design, and have been working rather too close to it, I want to take this opportunity to step back and allow others to comment on it. Here, I'll explain the procedure for implementing using-as
statements in theories here.
1) Rather than using parse_gat_line!
to handle using
statements, I chose defining several functions used by theory_impl
which intercept using statements in the theory body in order to expand their theories, and if applicable, collect renamings.
This works in the following order:
1.1) theory_impl
calls expand_theory
on the body.
1.2) expand_theory
loops through the body matching using statements (vanilla or using-as, a.k.a renaming). For every match, the using-theory is expanding and passed to usetheory!
1.3) usetheory!
accepts a "host" theory (the parent), a "guest" theory (the new one), and an optional dictionary of renames.
1.3.1) If there is a rename dictionary, we generate a new dictionary of instructions for the reident
function through makeidentdict
. There are two parts to this function:
1.3.1.1) Reident gives the "host" and "guest" theories a "nametag," a dictionary associating their names with their tags. As the dead code in nametag
suggests, I previously used namelookup and taglookup, but these don't update until later in theory_impl
, so the second using-as statement would still recognize the host as "EMPTY", as it would not yet have a namelookup.
(PROBLEM: I was unsure how to populate the LID, and used a placeholder LID(1)
).
1.3.1.2) A new dictionary is created associating each old ident to a new one.
1.3.2) the reident
uses the result of the makeidentdict
on the guest so it does not introduce conflicting idents.
2) Borrowing @kris-brown's design, I used a reident
function to recurse through the GAT replacing both names and tags at once. Because there are 30 methods of reident, I'm tempted to suggest reident should be a big ML-style lambda, so that (re)defining and diagnosing reidentification issues are defined in one definition.
3) Because there an arbitrary number of new segments, I loop over each segment to produce the juliadeclarations, new names, etc. This is an issue when we
push!(modulelines, :($(GlobalRef(TheoryInterface, :GAT_MODULE_LOOKUP))[$(gettag(newsegment))] = $name))
as there is no newsegment
.
ISSUES (non-exhaustive)
1) Because of my design, statements with the same name but different sorts are considered collisions and therefore omitted. As you can see in stdlib/dec.jl
, defining Forms from three using-as ThModule lines
@theory _CoChains begin
using ThModule: M as Form0
using ThModule: M as Form1
using ThModule: M as Form2
end
causes the module axioms over Form1 (and Form2) to be omitted, as the segment itself has the same tag. For the sake of demonstrating the problem further, I've defined three theories:
@rename ThMod0 begin using ThModule: M as Form0 end
@rename ThMod1 begin using ThModule: M as Form1 end
@rename ThMod2 begin using ThModule: M as Form2 end
We can see that the type declarations are different, but the segments defining \oplus have the same tag:
# The Form0 and Form1 segments
ThM0.Meta.theory.segments.scopes[1]
ThM1.Meta.theory.segments.scopes[1]
# The \oplus declaration with a constructor
ThM0.Meta.theory.segments.scopes[2]
ThM1.Meta.theory.segments.scopes[2]
#
ThM0.Meta.theory.segments.scopes[2].scope.tag
ThM1.Meta.theory.segments.scopes[2].scope.tag
My impression is that this problem could be fixed in the current design if reident could reidentify a segment if it contains bindings which contain a term that will be renamed, e.g.,
ThM1.Meta.theory.segments.scopes[2].scope.bindings
contains a Form1
and therefore should be reidented, as it is distinct from the same \oplus on Form0
. Then I wonder if the concept of a "nametag" should be just be expanded to every segment of the theory, not just "names."
2) A test will failed in SymbolicModels because Category.Ob must be imported. I reckon this is also attributed to some kludgery I've done.
Note: Not every theory has been rewritten for using-as statements. I decided not to do this in favor of just fixing the underlying problem.
EDIT After some thought, I think for the time being a more sophisticated dictionary for renaming would be the quickest fix.
For example, we can reident M
as Form0
and Form1
. However this would not reident the segment where \oplus
is declared; ThMod0
and ThMod1
both define \oplus as dependent on Form0
and Form1
respectively, but the segment itself remains the same. More sophisticated instructions passed to reident
would solve this immediate problem of "multiple inheritance in GATs," however axioms like (just for sake of example)
X \oplus (Y \oplus Z) == (X \oplus Y) \oplus Z \dashv [X::Form0, Y::Form1, Z::Form2]
would probably require more declarations of form-form interaction, e.g.
X \oplus Y :: Form1 \dashv [X::Form0, Y::Form1]
This particular example treads out of scope into graded theories, however.
There are actually two uses for renaming: renaming operations and renaming types.
Of the two, I think that renaming operations is "more destructive". For instance, we would use renaming operations to make ThAdditiveGroup
and ThMultiplicativeGroup
, which would be separate theories. This would be somewhat explicit/manual, and perhaps we wouldn't even have a renaming operation and instead just manually copy the algebraic hierarchy in additive and multiplicative notation. This is what lean does, it's bad but not that bad, and given that whenever we rename an operation we create a whole new Julia module and new functions, this probably should be something that has a bit of friction so that people don't go crazy with it.
However, for renaming types, we might want to take a slightly more gentle approach, inspired by trait systems in Rust/Haskell/Scala.
Specifically, we should have parameterized theories like
@theory ThGroup{T} begin
using ThMonoid{T}
inv(a::T)::T
# axioms...
end
Then the DEC can be written as
@theory ThDEC begin
Scalar::TYPE
Form0::TYPE
Form1::TYPE
Form2::TYPE
using ThRing{Scalar}
using ThAlgebra{Scalar, Form0}
using ThModule{Form0, Form1}
using ThModule{Form0, Form2}
end
The key thing is that these using
statements don't do any renaming; they just declare overloadings of the relevant operators in ThRing
/ThAlgebra
/ThModule
and they don't create any new Julia modules.
The one thing that we'd have to worry about is the methods that check for validity. I think that when we make a parameterized module, we simply don't create the methods for the types that are in the parameters. Then ThDEC
has validity checkers for Scalar, Form0, Form1, Form2
, but ThRing
doesn't.
Attention: Patch coverage is 84.47205%
with 25 lines
in your changes missing coverage. Please review.
Project coverage is 93.85%. Comparing base (
d56249e
) to head (57888cd
).
Files | Patch % | Lines |
---|---|---|
src/syntax/gats/gat.jl | 83.01% | 9 Missing :warning: |
src/syntax/Scopes.jl | 83.33% | 6 Missing :warning: |
src/syntax/gats/judgments.jl | 57.14% | 6 Missing :warning: |
src/stdlib/models/arithmetic.jl | 66.66% | 3 Missing :warning: |
src/syntax/gats/ast.jl | 93.33% | 1 Missing :warning: |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
This looks awesome. Long have I waited for such features :)