AlgebraicJulia / GATlab.jl

GATlab: a computer algebra system based on generalized algebraic theories (GATs)
https://algebraicjulia.github.io/GATlab.jl/
MIT License
21 stars 2 forks source link

Multiple inheritance of GATs #147

Closed olynch closed 1 week ago

epatters commented 3 months ago

This looks awesome. Long have I waited for such features :)

quffaro commented 3 months ago

@olynch I've just committed the work (unfinished) I started to work on renaming. You're right it's complex; it took me some time to develop a better understanding of the design after our discussion, so the work here is somewhat minimal.

As part to learning the design, I started a separate _usetheory function to build my understanding from the ground-up.

olynch commented 3 months ago

As discussed in private conversation, we want to strip out all of the renaming support here, because we have decided that renaming is tricky to get right. So this should hopefully end up being a pretty simple PR.

quffaro commented 1 month ago

The following is a summary of the work done so far on multiple-inheritance. While a @rename macro solution was proposed to close out the branch, I'm requesting help getting the using-as feature of multiple-inheritance over the finish line.

In March, @olynch and I discussed stripping out the renaming in favor of a @rename macro. That is implemented

@rename ThMod0 begin using ThModule: M as Form0 end

but after a ~three-month break, I once again took another stab at using default renaming with some success in defining ThModule through using statements. After some tweaking from catching the repo up with main, the following is a working theory in GATlab and has examples in src/stdlib/models/twosorted.jl:

@theory ThModule begin
  using ThAb: default as M, ⋅ as ⊕
  using ThRing: default as Scalar, one as unit

  (r ⋅ a) :: M ⊣ [r::Scalar, a::M]                                    # R-actions
  (r ⋅ (a ⊕ b)) == ((r ⋅ a) ⊕ (r ⋅ b)) ⊣ [r::Scalar, a::M, b::M]      # R-action left-distributes
  ((r + s) ⋅ a) == ((r ⋅ a) ⊕ (s ⋅ a)) ⊣ [r::Scalar, s::Scalar, a::M] # addition of R-actions
  (r * s) ⋅ a == r ⋅ (s ⋅ a) ⊣ [r::Scalar, s::Scalar, a::M]           # composition of R-action
  unit ⋅ a == a ⊣ [unit::Scalar, a::M]                                # unit

  (a ⋅ r) :: M ⊣ [r::Scalar, a::M]                                    # R-actions
  ((a ⊕ b) ⋅ r) == ((a ⋅ r) ⊕ (b ⋅ r)) ⊣ [r::Scalar, a::M, b::M]      # R-action right-distributes
  (a ⋅ (r + s)) == ((a ⋅ r) ⊕ (a ⋅ s)) ⊣ [r::Scalar, s::Scalar, a::M] # addition of R-actions
  a ⋅ (r * s) == (a ⋅ r) ⋅ s ⊣ [r::Scalar, s::Scalar, a::M]           # composition of R-action
  a ⋅ unit == a ⊣ [unit::Scalar, a::M]                                # unit
end

I'm happy with the progress made to implement multiple-inheritance, but naturally over the course of learning the GATlab architecture, I likely made several suboptimal kludgey design choices, whereas I'd be more satisfied with an elegant solution. However because I'm somewhat invested in the design, and have been working rather too close to it, I want to take this opportunity to step back and allow others to comment on it. Here, I'll explain the procedure for implementing using-as statements in theories here.

1) Rather than using parse_gat_line! to handle using statements, I chose defining several functions used by theory_impl which intercept using statements in the theory body in order to expand their theories, and if applicable, collect renamings.

This works in the following order:

1.1) theory_impl calls expand_theory on the body.

1.2) expand_theory loops through the body matching using statements (vanilla or using-as, a.k.a renaming). For every match, the using-theory is expanding and passed to usetheory!

1.3) usetheory! accepts a "host" theory (the parent), a "guest" theory (the new one), and an optional dictionary of renames.

1.3.1) If there is a rename dictionary, we generate a new dictionary of instructions for the reident function through makeidentdict. There are two parts to this function:

1.3.1.1) Reident gives the "host" and "guest" theories a "nametag," a dictionary associating their names with their tags. As the dead code in nametag suggests, I previously used namelookup and taglookup, but these don't update until later in theory_impl, so the second using-as statement would still recognize the host as "EMPTY", as it would not yet have a namelookup.

(PROBLEM: I was unsure how to populate the LID, and used a placeholder LID(1)).

1.3.1.2) A new dictionary is created associating each old ident to a new one.

1.3.2) the reident uses the result of the makeidentdict on the guest so it does not introduce conflicting idents.

2) Borrowing @kris-brown's design, I used a reident function to recurse through the GAT replacing both names and tags at once. Because there are 30 methods of reident, I'm tempted to suggest reident should be a big ML-style lambda, so that (re)defining and diagnosing reidentification issues are defined in one definition.

3) Because there an arbitrary number of new segments, I loop over each segment to produce the juliadeclarations, new names, etc. This is an issue when we

push!(modulelines, :($(GlobalRef(TheoryInterface, :GAT_MODULE_LOOKUP))[$(gettag(newsegment))] = $name))

as there is no newsegment.

ISSUES (non-exhaustive)

1) Because of my design, statements with the same name but different sorts are considered collisions and therefore omitted. As you can see in stdlib/dec.jl, defining Forms from three using-as ThModule lines

@theory _CoChains begin
  using ThModule: M as Form0
  using ThModule: M as Form1
  using ThModule: M as Form2
end

causes the module axioms over Form1 (and Form2) to be omitted, as the segment itself has the same tag. For the sake of demonstrating the problem further, I've defined three theories:

@rename ThMod0 begin using ThModule: M as Form0 end
@rename ThMod1 begin using ThModule: M as Form1 end
@rename ThMod2 begin using ThModule: M as Form2 end

We can see that the type declarations are different, but the segments defining \oplus have the same tag:

# The Form0 and Form1 segments
ThM0.Meta.theory.segments.scopes[1]
ThM1.Meta.theory.segments.scopes[1]
# The \oplus declaration with a constructor
ThM0.Meta.theory.segments.scopes[2]
ThM1.Meta.theory.segments.scopes[2]
#
ThM0.Meta.theory.segments.scopes[2].scope.tag
ThM1.Meta.theory.segments.scopes[2].scope.tag

My impression is that this problem could be fixed in the current design if reident could reidentify a segment if it contains bindings which contain a term that will be renamed, e.g.,

ThM1.Meta.theory.segments.scopes[2].scope.bindings

contains a Form1 and therefore should be reidented, as it is distinct from the same \oplus on Form0. Then I wonder if the concept of a "nametag" should be just be expanded to every segment of the theory, not just "names."

2) A test will failed in SymbolicModels because Category.Ob must be imported. I reckon this is also attributed to some kludgery I've done.

Note: Not every theory has been rewritten for using-as statements. I decided not to do this in favor of just fixing the underlying problem.

EDIT After some thought, I think for the time being a more sophisticated dictionary for renaming would be the quickest fix.

For example, we can reident M as Form0 and Form1. However this would not reident the segment where \oplus is declared; ThMod0 and ThMod1 both define \oplus as dependent on Form0 and Form1 respectively, but the segment itself remains the same. More sophisticated instructions passed to reident would solve this immediate problem of "multiple inheritance in GATs," however axioms like (just for sake of example)

X \oplus (Y \oplus Z) == (X \oplus Y) \oplus Z \dashv [X::Form0, Y::Form1, Z::Form2]

would probably require more declarations of form-form interaction, e.g.

X \oplus Y :: Form1 \dashv [X::Form0, Y::Form1]

This particular example treads out of scope into graded theories, however.

olynch commented 1 month ago

There are actually two uses for renaming: renaming operations and renaming types.

Of the two, I think that renaming operations is "more destructive". For instance, we would use renaming operations to make ThAdditiveGroup and ThMultiplicativeGroup, which would be separate theories. This would be somewhat explicit/manual, and perhaps we wouldn't even have a renaming operation and instead just manually copy the algebraic hierarchy in additive and multiplicative notation. This is what lean does, it's bad but not that bad, and given that whenever we rename an operation we create a whole new Julia module and new functions, this probably should be something that has a bit of friction so that people don't go crazy with it.

However, for renaming types, we might want to take a slightly more gentle approach, inspired by trait systems in Rust/Haskell/Scala.

Specifically, we should have parameterized theories like

@theory ThGroup{T} begin
  using ThMonoid{T}
  inv(a::T)::T
  # axioms...
end

Then the DEC can be written as

@theory ThDEC begin
  Scalar::TYPE
  Form0::TYPE
  Form1::TYPE
  Form2::TYPE
  using ThRing{Scalar}
  using ThAlgebra{Scalar, Form0}
  using ThModule{Form0, Form1}
  using ThModule{Form0, Form2}
end

The key thing is that these using statements don't do any renaming; they just declare overloadings of the relevant operators in ThRing/ThAlgebra/ThModule and they don't create any new Julia modules.

The one thing that we'd have to worry about is the methods that check for validity. I think that when we make a parameterized module, we simply don't create the methods for the types that are in the parameters. Then ThDEC has validity checkers for Scalar, Form0, Form1, Form2, but ThRing doesn't.

codecov[bot] commented 2 weeks ago

Codecov Report

Attention: Patch coverage is 84.47205% with 25 lines in your changes missing coverage. Please review.

Project coverage is 93.85%. Comparing base (d56249e) to head (57888cd).

Files Patch % Lines
src/syntax/gats/gat.jl 83.01% 9 Missing :warning:
src/syntax/Scopes.jl 83.33% 6 Missing :warning:
src/syntax/gats/judgments.jl 57.14% 6 Missing :warning:
src/stdlib/models/arithmetic.jl 66.66% 3 Missing :warning:
src/syntax/gats/ast.jl 93.33% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #147 +/- ## ========================================== - Coverage 94.52% 93.85% -0.68% ========================================== Files 38 38 Lines 2082 2229 +147 ========================================== + Hits 1968 2092 +124 - Misses 114 137 +23 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.