Alternative `ModelKinds` design + Display Language

balacij commented 3 years ago

Understanding the existing `ModelKinds` design

Taking a look at our existing ModelKinds definition: https://github.com/JacquesCarette/Drasil/blob/0ea1e82231c83ae999b58cfc871d85fbbcaa3ef3/code/drasil-theory/lib/Theory/Drasil/ModelKinds.hs#L33-L38

We have a type variable "e", which we place "Expr" or "ModelExpr" in. I believe this type variable also contributes to the issue that Dr. Carette noted -- this can certainly be confusing, but I think it also indicates that we should re-investigate #2599 and ModelKinds. The goal of this ticket is to create a new design for ModelKinds as a series of constraints.

First, why was this type variable introduced? It was introduced to restrict "InstanceModels" to models containing Exprs (which can be totally converted into GOOL well), and the other *Models to ModelExprs (which can't be totally converted into GOOL without manual intervention).

Problems with this design (of ModelKinds + the type variable)

The type variable isn't always applicable to each model, which results in awkward fits at times.

It does indeed restrict InstanceModels to only models containing Exprs. However, I don't believe that "Expr" should be the requirement that we have to indicate usability in an instance model because not all models use "Exprs", and some won't contain any sort of expression language. DEModels are also usable in InstanceModels, but they're built using ModelExprs (in their RelationConcepts [at the moment]), which is why the smart constructors allow the users to fit them into any ModelExpr type hole: https://github.com/JacquesCarette/Drasil/blob/0ea1e82231c83ae999b58cfc871d85fbbcaa3ef3/code/drasil-theory/lib/Theory/Drasil/ModelKinds.hs#L49-L55

This indicates that the type variable isn't needed (and is irrelevant) for DEModels.

Additionally, looking at all 3 Equational* constructors, they are all forced to contain "Exprs" if they are to be used in InstanceModels, and they are all forced to contain "ModelExprs" if they are to be used in TMs/GDs.
We have implicit, unclear requirements of each ModelKind.

A lot of code is spent on passing through information for used typeclasses. However, the requirements of the ModelKinds isn't quite well documented yet, so it's difficult to say which is required for each model. https://github.com/JacquesCarette/Drasil/blob/0ea1e82231c83ae999b58cfc871d85fbbcaa3ef3/code/drasil-theory/lib/Theory/Drasil/ModelKinds.hs#L105-L116
There is a lot of code duplication.

A lot of the code duplication occurs as a result of the above point.
Adding more "ModelKinds" is difficult.

The existing ModelKinds are built using standard GADTs. Adding a new "ModelKind" requires us to create a new data constructor in drasil-theory/ModelKinds.hs, create a lens that satisfies some constraints so that it may satisfy it's lens and typeclass instances.

An Alternative Design

I've come up with 2 real designs (technically 3, but 2 are essentially the same idea w/ different stylization), with a working prototype on a stripped drasil-lang and drasil-theory on my ExprTests repo.

Design 1 - using an "Abstractness" datatype to indicate whether a model is usable in code generation as an alternative to using Expr

-- | Is a model "concrete" (must be usable in code generation), 
--   or "abstract" (can be, but not needed to, be used in code generation,
--   but must be usable in modelling)?
data Abstractness = Concrete | Abstract

-- | `ModelKinds` is a container for describing allowed "models" inside of TMs/IMs/GDs.
--   
--   There are different types of them. Some may be "concrete" or "abstract",
--   and some would contain Exprs or ModelExprs or other usable types.
data ModelKinds (c :: Abstractness) e where
    EquationalModel :: QDefinition e     -> ModelKinds c e
    -- TODO: EquationalRealm (using ConstraintSet replica)
    -- TODO: EquationalConstraints (using MultiDefn replica)
    DEModel         :: RelationConcept Relation -> ModelKinds Concrete e
    OtherModel      :: RelationConcept e -> ModelKinds Abstract e

-- | Container for "Concrete Model Kinds usable in code generation"
data ConcreteModelKind = forall t. CMK (ModelKinds Concrete (Expr t))
-- | Container for "Abstract Model Kinds"
data AbstractModelKind = forall t. AMK (ModelKinds Abstract (ModelExpr t))

(from: https://github.com/balacij/ExprTests/blob/614b84675be56421ec62be49ca017374ed4ff253/src/Theory/ModelKinds.hs#L52-L80 )

Note that this doesn't stop us from creating "ModelKinds Concrete (ModelExpr t)", which would be inadmissible. This is a rather brittle solution, with an unnecessary "Expr"-kind, and arbitrary "Abstractness" imposition. To make up for allowing inadmissible constructions, we are forced to force requirements in the smart constructors, e.g.:

-- | QDefinitions with Exprs should only be used to create "Concrete Model Kinds"
conEquatModel :: QDefinition (Expr t) -> ConcreteModelKind
conEquatModel = CMK . EquationalModel

-- | QDefinitions with ModelExprs should only be used to create "Abstract Model Kinds"
absEquatModel :: QDefinition (ModelExpr t) -> AbstractModelKind
absEquatModel = AMK . EquationalModel

Design 2 - describing "ModelKinds" as a typeclass with a series of typeclass constraints

We define a "CoreModelKinds" which contains the core functionality required for all *Models.

class (Display e
    ,  HasUID e
    ,  HasShortName e
    ) => CoreModelKinds2' e where

NOTE: There will need to be at least 6 constraints, but we can omit them from this proposal since they're, implementation-wise, the same as "HasUID" or "HasShortName". However, we notable replace the "Express" constraint (which would expect to convert things into ModelExprs) with a "Display" constraint which allows for things to be placed inside a new basic Display-only language (which sits atop Expr and ModelExpr, intended for formatting them next to each other [without changing them at all]).

Then we define an "Instantiable" variant (intended for things that are usable in InstanceModels):

class (CoreModelKinds2' e
      -- TODO: Here is where the "code usable" restriction would need to go
      ) => InstantiableModelKinds2' e where

We would need to figure out a constraint on things which would allow them to be replaced with code fragments for drasil-code to be able to generate code. I'm not entirely sure of what this constraint should be yet, however, I believe that we should work towards figuring this out so that we can implement this solution, as I believe it to be the more robust solution because this solution:

Solves our extensibility problem in 2 ways:
- Allowing us to define "ModelKinds" anywhere (by creating new types which satisfy the constraints) of either "core model kinds" or "instantiable model kinds".
- It allows us to create new extended kinds of modelkinds (e.g., a new kind of instantiable model kind for a new different set of generators, etc which may require extra things for them to use the models). I think this is the most important bonus of this solution.
Forces us to understand what makes up a "ModelKind" (and it's extensions)
Shortens our code by removing code duplication, and we no longer need to create "ModelKinds" instances for things because those things would become "ModelKinds" themselves (e.g., we wouldn't create any more EquationalModels because the QDefinition itself would be the EquationalModel).
The type variables are removed, so we won't have problems with the "abstractness" nor "expr" requirements of what makes up a "concrete" or "abstract" model. This should also resolve a bit about the "Expr/ModelExpr" confusion that Dr. Carette mentioned. It also removes much of the "express" upgrades of Exprs to ModelExprs for display.

However, it's not without its cons:

We lose out on the type tags of the existing ModelKinds (e.g., EquationalModel, EquationalRealm, etc)
Despite us never having ModelKinds constructors that had more than 1 input requirement, this would make those unknown instances require a wrapper to contain/merge those 2 data types instead of having it done here. This one is arguably not a "con" but a "pro" in disguise, because it could allow us to create reusable generic boxes which satisfy all but a few of the constraints (e.g., satisfying UID, ShortName, ConceptDomain, etc but leaving the Display for the user to define).

Prototype code (which also has a lot of extra discussion and shows more code): https://github.com/balacij/ExprTests/blob/614b84675be56421ec62be49ca017374ed4ff253/src/Theory/ModelKinds.hs#L138-L141 https://github.com/balacij/ExprTests/blob/614b84675be56421ec62be49ca017374ed4ff253/src/Theory/ModelKinds.hs#L155-L157

Final thought

I think that "Design 2" is a potentially good solution to the problems with the existing model kinds, we would just need to figure out how to handle the "usable in code generation" constraint.

All related code can be found at: https://github.com/balacij/ExprTests

balacij commented 3 years ago

I posted this ticket just a bit early. I will add some extra discussion (as comments) later today/tomorrow about the following:

Relation to #2599 (e.g. less duplicate code, and extensible model kinds). An interesting part about Design 2 of ModelKinds + the discussion on #2599 is that GDs and TMs as discussed by Dr. Carette would essentially be versions of "CoreModelKinds", IMs would essentially be "InstantiableModelKinds" (a refined GD and TM of sorts [refined to satisfy the "instantiability" constraint -- ability to be rendered into realizable code fragments, in some capacity]) which are essentially extensible ModelKinds. I'm not entirely sure yet, but this might be hinting towards what refinement could look like in Drasil.
Display language
Discuss why I wrote "at least 6 constraints" -- they're just the core typeclasses currently implemented for our ModelKinds right now. These core 6 should be the least possible requirements for Drasil to be able to build ChunkDBs for systems.
QDefinitions now have 1 type variable for their expression. Should they have a second type variable for the thing that they essentially define, and then be renamed to "Definition"s?
The "type tags" can be regained via naive approaches if we only need them for surface-level things (e.g., String representations/Display representations of the model names).

balacij commented 3 years ago

A variant of this might be well-paired with #2873. Though, I think we can close this ticket, unless there's anything directly salvageable right now.

balacij commented 2 years ago

I was going to make a new ticket regarding the existing type parameter in ModelKinds, but then I remembered this issue.

I didn't finish the ticket completely, but here's the content I had written: **Should `ModelKinds` have a type parameter? If so, for what?** Currently, `ModelKinds` have a type parameter that we use for distinguishing between which variants of a ModelKind are usable in an InstanceModel vs a TheoryModel or a GeneralDefinition. https://github.com/JacquesCarette/Drasil/blob/5dc10b54aefd5c6ddcef0ce15706b95760f6d7ca/code/drasil-theory/lib/Theory/Drasil/ModelKinds.hs#L34-L40 Specifically, an InstanceModel currently carries a `ModelKind Expr`, from which it derives it's own information from for the SRS. The `Expr` is meant to indicate that the underlying `ModelKind` is something that we can generate code for, somehow. In the case of a "QDefinition", this might be okay, but in the case of DEModels, where we might use a `ModelExpr` (within a `RelationConcept`), then having a `Expr` as the type argument isn't very fitting.

Revisiting this ticket, I realized that there's a better way to write "Design 2" as above, it just requires an extra language extension: ConstraintKinds.

I wrote a very small demonstration of it.

The general idea is that we can distill what we want from the underlying models through ```haskell {-# LANGUAGE ConstraintKinds #-} module Lib ( someFunc ) where class Foo a where foo :: a -> String class Bar a where bar :: a -> String data Potato = Potato String String instance Foo Potato where foo (Potato a _) = a instance Bar Potato where bar (Potato _ b) = b type FooBar a = (Foo a, Bar a) prFooBar :: FooBar a => a -> IO () prFooBar a = do print (foo a) print (bar a) class Baz a where baz :: a -> String instance Baz Potato where baz (Potato a b) = a ++ b prFooBarBaz :: (FooBar a, Baz a) => a -> IO () prFooBarBaz a = do prFooBar a print (baz a) -- Below function outputs: -- -- "FOO!" -- "BAR!" -- "FOO!" -- "BAR!" -- "FOO!BAR!" someFunc :: IO () someFunc = do let potato = Potato "FOO!" "BAR!" prFooBar potato prFooBarBaz potato ```

In the example (which I've collapsed to make this post a bit easier to scroll through), we can have the main components of a ModelKind made similar to FooBar (these ModelKinds would give us a basic set of requirements of the underlying fragments), and then we could have InstanceModels or the Grounded Theories have an extra requirement that we should be able to generate code from them somehow (currently, primarily just being able to expose QDefinition Exprs really).

This is a fairly nice resolution to the dubious empty typeclass I had thought of for Design 2 (which I only made do with because I didn't know we needed the extra language extension to be able to make constraint aliases).

JacquesCarette commented 2 years ago

ConstraintKinds are indeed a nice piece of Haskell, and could be part of the solution.

At a higher level, here is how I use the Haskell parts to connect up with what we need to do:

data constructors (whether ADTs or GADTs) are useful when we have multiple potential choices, and we need to discover at the point of use what we have, as these can be queried (by pattern matching)
class methods are useful when we have many different representations, but they all share some core ideas. At the point of use, we just want "the idea" (via a function) and don't care about the details at all.
record fields (with data) are useful when we have a specific representation that's made up of pieces, and we want to access the whole and the pieces
record fields (with all functions) are one way to implement fake objects; they're basically equivalent to (Haskell) class methods

We have different kinds of models explicitly because they contain different kinds of information, and we want to know what that information is, at least for some purposes. There are also other pieces of functionality that explicit wants to abstract over those details!

To me, this says that an ADT-based solution for ModelKinds seems to fit best.

First, why was this type variable introduced? It was introduced to restrict "InstanceModels" to models containing Exprs (which can be totally converted into GOOL well), and the other *Models to ModelExprs (which can't be totally converted into GOOL without manual intervention).

So maybe this is where we erred. Maybe we shouldn't be trying to use Haskell's type system to enforce this? We still need to 'enforce' this, but maybe we're trying too hard?

smiths commented 2 years ago

@JacquesCarette, I like your list of how you use the different parts of Haskell (data constructors, class methods, etc.). Is this advice we should capture somewhere for future Drasil collaborators?

JacquesCarette commented 2 years ago

Probably a good idea. I'd never written that down before, nor even really thought about it consciously. But this issue pushed me to make explicit why I was feeling uncomfortable with some of the proposed solutions.

smiths commented 2 years ago

@balacij, do you know a spot in our wiki that would be good for capturing the above high level advice from @JacquesCarette? If there isn't a current spot for it, should we create a wiki page with a title like "Drasil design guidelines"?

balacij commented 2 years ago

Perhaps the Types and Typeclasses page? Right now, that page is primarily a list of types, but we can retrofit it a bit to contain the more meaningful discussion of types that @JacquesCarette wrote up for us.

smiths commented 2 years ago

Yes @balacij the Types and Typeclasses page sounds like a good home. Maybe a new heading for "Guidelines for Adding New Types and Typeclasses"? @balacij would you mind putting the information on the Types and Typeclasses page?

balacij commented 2 years ago

Yep, that sounds good me. I just posted it. @JacquesCarette's Information Encoding would probably be a good area to link to it as well, or possible moving it into entirely. I'll only add a link at the bottom of the page for now I think (but please let me know what you think!).

smiths commented 2 years ago

Looks good! Thanks @balacij.

balacij commented 2 years ago

To me, this says that an ADT-based solution for ModelKinds seems to fit best.

I thought that I understood what you had written earlier, but now I feel I'm missing something to conclude that ModelKinds should be ADT based. I still see class methods as the solution.

So maybe this is where we erred. Maybe we shouldn't be trying to use Haskell's type system to enforce this? We still need to 'enforce' this, but maybe we're trying too hard?

This is definitely possible. We might've been tempted by something that might've turned to a dead end. Naively, we could split ModelKinds into 2 variants; one for grounded theories, and one that's more "open" for everything else.

JacquesCarette commented 2 years ago

We should circle back on this. We need to understand the problem, so that we can settle on a design.

JacquesCarette / Drasil