charlesroddie / MathAtom

Structural representations of visual mathematics expressions for use in .Net rendering libraries
MIT License

The Atom types in wpf-math and CSharpMath #2

Open charlesroddie opened 5 years ago

charlesroddie commented 5 years ago

Both libraries use C# and it was decided to stick with C#. The following is F# code because even if we don't compile it, it's good pseudocode.

The wpf-math Atom

type Atom =
    /// representing horizontal row of other atoms, separated by glue.
    | Row of Atom list
    /// single character in specific text style
    | Char of char * textStyle: string option
    /// character that does not depend on text style
    | FixedChar of c:char * fontId:int
    /// base atom with accent above it
    | Accented of Atom * Accent
    /// big delimiter (e.g. brackets)
    | BigDelimiter of delimiter:Atom * size:int // why is this called BigDelimiter not BigAtom?
    /// big operator with optional limits
    | BigOperator of baseAtom:Atom * upperLimit:Atom * lowerLimit:Atom // baseAtom must have type BigOperator
    /// base atom surrounded by delimiters
    | Fenced of baseAtom:Atom * leftDelimiter:Symbol * RightDelimiter:Symbol
    /// fraction, with or without separation line
    | Fraction of numerator:Atom * denominator:Atom * nAlign:XAlignment * dAlign:XAlignment
    /// other atom with horizontal rule above it
    | Overlined of baseAtom:Atom
    /// scripts to attach to other atom
    | Scripts of baseAtom:Atom * subscriptAtom: Atom option * superscriptAtom: Atom option
    /// whitespace
    | Space of width:float<mu> * height:float<mu>
    /// other atom that is not rendered
    | Phantom of baseAtom:Atom * useWidth:bool * useHeight:bool * useDepth:bool
    /// other atom with custom left and right types
    | Typed of atom:Atom * leftType:TexAtomType * rightType:TexAtomType
    /// single character that can be marked as text symbol
    | CharSymbol of isTextSymbol:bool
    /// symbol (non-alphanumeric character)
    | Symbol of name:char * TexAtomType * isDelimiter:bool
    /// other atom with delimiter and script atoms over or under it
    | OverUnderDelimiter of baseAtom:Atom * script:Atom * Symbol * kern:float<mu> * over:bool
    /// other atom that is underlined
    | Underlined of baseAtom:Atom
    /// other atom with atoms optionally over and under it
    | UnderOver of baseAtom:Atom * underOver:Atom * underOverSpace:float<mu>
    /// other atom vertically centered with respect to axis
    | VerticallyCentered of atom:Atom
    /// radical (nth-root) construction
    | Radical of baseAtom:Atom * degreeAtom:Atom
    /// Atom specifying graphical style.
    | Styled of RowAtom* background: Brush * foreground: Brush
    /// Dummy atom whose type can change or which can be replaced by a ligature.
    | Dummy of atom:Atom * isTextSymbol:bool

    /// gets the types of the leftmost and rightmost children, or the type of the atom itself if it is childless
    member t.GetLR: TexAtomType * TexAtomType = ...
    member t.GetLeftType  = t.GetLR |> fst
    member t.GetRightType = t.GetLR |> snd
    member t.Type:TexAtomType = ...

In CSharp there is an Atom class and subclasses inherit from it.

These classes also contain code to generate a Box, which describes how to lay out and render it. I think this part can be moved into a separate layer and Atom classes just used to give a structural representation.
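
To make the proposed split concrete, here is a rough C# sketch, assuming a tiny subset of atoms. All names here (`CharAtom`, `Layouter`, the string-based `Box`) are hypothetical stand-ins and not the wpf-math or CSharpMath types; a real `Box` would carry metrics rather than a description.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical structural types: they hold data only, no rendering logic.
public abstract class Atom { }

public sealed class Row : Atom
{
    public IReadOnlyList<Atom> Atoms { get; }
    public Row(IReadOnlyList<Atom> atoms) => Atoms = atoms;
}

public sealed class CharAtom : Atom
{
    public char Character { get; }
    public CharAtom(char character) => Character = character;
}

public sealed class Overlined : Atom
{
    public Atom BaseAtom { get; }
    public Overlined(Atom baseAtom) => BaseAtom = baseAtom;
}

// Hypothetical stand-in for a layout box; it only records a description.
public sealed class Box
{
    public string Description { get; }
    public Box(string description) => Description = description;
}

// The layout layer lives outside the Atom classes and only reads their structure.
public static class Layouter
{
    public static Box Layout(Atom atom)
    {
        switch (atom)
        {
            case CharAtom c:
                return new Box($"char({c.Character})");
            case Overlined o:
                return new Box($"overline({Layout(o.BaseAtom).Description})");
            case Row r:
                var parts = new List<string>();
                foreach (var child in r.Atoms) parts.Add(Layout(child).Description);
                return new Box($"row({string.Join(",", parts)})");
            default:
                throw new ArgumentException($"Unhandled atom type: {atom.GetType().Name}");
        }
    }
}
```

With this shape the Atom tree can be built, compared, or serialized without touching any rendering code, and each renderer can supply its own layout function.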

charlesroddie commented 5 years ago

The CSharpMath MathAtom/IMathList

This is what I could find in the CSharpMath repo but I believe it's incomplete. I am unclear on why there are fewer cases than for wpf-math. Are some Atoms getting created directly through the MathAtom constructor rather than via a subclass constructor? @Happypig375

type Atom =
    | List of Atom list // This is a MathList
    | Accent of Atom
    | Color of Atom * colorString: string
    // What are the delimiters?
    | Fraction of numerator:Atom * denominator:Atom * leftDelimiter:string * rightDelimiter:string * hasRule:bool
    /// This is an Atom surrounded by bounds. Bounds are atoms with MathAtomType.Boundary
    | Inner of inner:Atom * lBound:Atom * rBound:Atom
    | LargeOperator of Atom * limits:bool
    /// Atom followed by a number of primes
    | Primes of numberOfPrimes:int * Atom
    | Radical of degree:Atom * radicand:Atom
    | Space of length:float<mu>
    | Styled of Atom * Style
    /// E.g. a matrix. Each list describes a row. Rows may have different lengths apparently.
    | Table of (Atom list) list * interColumnSpacing:float<mu> * interRowAdditionalSpacing:float<mu> * columnAlignments: ColumnAlignment list
    | Underline of Atom
    | Overline of Atom
    | RaiseBox of raise:float<mu> * inner:Atom

There are also dedicated hashing functions contained in the subclasses whose purpose I am not sure of.

A superscript and subscript option is inside each MathAtom in CSharpMath.

charlesroddie commented 5 years ago

wpf-math enums

type TexDelimiterOverUnder = Over  | Under

[<RequireQualifiedAccess>]
type TexAtomType =
    | None
    | Ordinary
    | BigOperator
    | BinaryOperator
    | Relation
    | Opening
    | Closing
    | Punctuation
    | Inner
    | Accent

// These are combined in wpf-math but seem like they should be separated.
type XAlignment = Left | Right | Center
type YAlignment = Top | Bottom

type TexDelimeter =
    | Brace
    | SquareBracket
    | Bracket
    | LeftArrow
    | RightArrow
    | LeftRightArrow
    | DoubleLeftArrow
    | DoubleRightArrow
    | DoubleLeftRightArrow
    | SingleLine
    | DoubleLine

type TexStyle = Display | Text | Script | ScriptScript

charlesroddie commented 5 years ago

CSharpMath MathAtomType

type MathAtomType =
    | MinValue
    | Ordinary
    | Number
    | Variable
    /// A large operator such as sin/cos, integral, etc.
    | LargeOperator
    | BinaryOperator
    | UnaryOperator
    | Relation
    /// Open brackets
    | Open
    /// Close brackets
    | Close
    | Fraction
    | Radical
    | Punctuation
    /// A placeholder for future input
    | Placeholder
    /// An inner atom, i.e. embedded math list
    | Inner
    | Underline
    | Overline
    | Accent
    | Group // what is a Group?
    | RaiseBox
    | Prime
    | Boundary
    | Space
    ///Style changes during rendering
    | Style
    | Color
    ///A table. Not part of TeX.
    | Table

charlesroddie commented 5 years ago

Once we have a clearer idea about CSharpMath structure we can start to compare.

Happypig375 commented 5 years ago

Will revisit tomorrow.

Happypig375 commented 5 years ago

@charlesroddie

/// big operator with optional limits
| BigOperator of baseAtom:Atom * upperLimit:Atom * lowerLimit:Atom // baseAtom must have type BigOperator

Is 'BigOperator' a typo? This is a circular reference.

I am unclear on why there are fewer cases than for wpf-math. Are some Atoms getting created directly through the MathAtom constructor rather than via a subclass constructor?

Yes. See:

https://github.com/verybadcat/CSharpMath/blob/e17c49161326f750576be0f5c4f6b3076e9e3f54/CSharpMath/Atoms/Factories/MathAtoms.cs#L12-L274

| Accent of Atom // Where is the accent specified?

https://github.com/verybadcat/CSharpMath/blob/e17c49161326f750576be0f5c4f6b3076e9e3f54/CSharpMath/Atoms/Factories/MathAtoms.cs#L290-L311

// What are the delimiters?
| Fraction of numerator:Atom * denominator:Atom * leftDelimiter:string * rightDelimiter:string * hasRule:bool

https://github.com/verybadcat/CSharpMath/blob/e17c49161326f750576be0f5c4f6b3076e9e3f54/CSharpMath/Atoms/Factories/MathAtoms.cs#L451-L477

// unclear how LargeOperator can function only knowing the limits exist but not knowing what they are
| LargeOperator of Atom * limits:bool

The limits are stored in the Superscript and Subscript properties defined in MathAtom.

// This is inside MathAtom in CSharpMath but the wpf-Math approach looks clearly better
| Scripts of baseAtom:Atom * subscriptAtom: Atom option * superscriptAtom: Atom option

But the locations of the scripts are affected by the previous atom, which kind of defeats the purpose of a separate type of atom for scripts. Edit: Revoked.

// unsure what degree is
| Radical of Atom * degree:Atom * radicand:Atom

It's really the index of the radical.

charlesroddie commented 5 years ago

Opinionated comparison table

Equivalent

| wpf-math | CSharpMath | Comment |
| --- | --- | --- |
| `Row of Atom list` | `MathList of Atom list` | Row is probably a better name |
| `Overlined of baseAtom:Atom` | `Overline of Atom` | |
| `Underlined of baseAtom:Atom` | `Underline of Atom` | |
| `Radical of baseAtom:Atom * degreeAtom:Atom` | `Radical of degree:Atom * radicand:Atom` | |

Take from wpf-math?

| wpf-math | CSharpMath | Comment |
| --- | --- | --- |
| `Accented of Atom * Accent` | `Accent of Atom` | I think the CSM Accent is just the accent and not the thing accented. |
| `BigOperator of baseAtom:Atom * upperLimit:Atom * lowerLimit:Atom` | `LargeOperator of Atom * limits:bool` | replace baseAtom with an Operator enum in wpf-math? |
| `Fenced of baseAtom:Atom * leftDelimiter:Symbol * RightDelimiter:Symbol` | `Inner of inner:Atom * lBound:Atom * rBound:Atom` | Symbol more specific than Atom |
| `Fraction of numerator:Atom * denominator:Atom * nAlign:XAlignment * dAlign:XAlignment` | `Fraction of numerator:Atom * denominator:Atom * leftDelimiter:string * rightDelimiter:string * hasRule:bool` | |
| `Scripts of baseAtom:Atom * subscriptAtom: Atom option * superscriptAtom: Atom option` | a subscript and superscript Atom option is inside every atom | |

Take from CSharpMath?

| wpf-math | CSharpMath | Comment |
| --- | --- | --- |
| | `Table of (Atom list) list * interColumnSpacing:float<mu> * interRowAdditionalSpacing:float<mu> * columnAlignments: ColumnAlignment list` | CSharpMath supports matrices. |
| `Space of width:float<mu> * height:float<mu>` | `Space of length:float<mu>` | Don't think we need vertical space in math mode? |

Discuss

| wpf-math | CSharpMath | Comment |
| --- | --- | --- |
| `Char of char * textStyle: string option` | | |
| `FixedChar of c:char * fontId:int` | | |
| `CharSymbol of isTextSymbol:bool` | | |
| `Symbol of name:char * TexAtomType * isDelimiter:bool` | | |
| `UnderOver of baseAtom:Atom * underOver:Atom * underOverSpace:float<mu>` | | not sure what this is |
| `OverUnderDelimiter of baseAtom:Atom * script:Atom * Symbol * kern:float<mu> * over:bool` | | not sure what this is |
| `BigDelimiter of delimiter:Atom * size:int` | | replace delimiter with a Delimiter enum? |
| | `RaiseBox of raise:float<mu> * inner:Atom` | |
| | `Primes of numberOfPrimes:int * Atom` | |
| `Styled of RowAtom* background: Brush * foreground: Brush` | `Styled of Atom * Style` | |
| | `Color of Atom * colorString: string` | |
| `Phantom of baseAtom:Atom * useWidth:bool * useHeight:bool * useDepth:bool` | | Is this the same as transparent color? What happens when useHeight or useDepth is false? |
| `VerticallyCentered of atom:Atom` | | |

Remove?

| wpf-math | CSharpMath | Comment |
| --- | --- | --- |
| `Typed of atom:Atom * leftType:TexAtomType * rightType:TexAtomType` | | a hack in wpf-math? |
| `Dummy of atom:Atom * isTextSymbol:bool` | | |

Happypig375 commented 5 years ago

CSharpMath has an Overline atom too.

Happypig375 commented 5 years ago

But the locations of the scripts are affected by the previous atom, which kind of defeats the purpose of a separate type of atom for scripts.

Statement revoked. After looking into its source in wpf-math, it does seem to be better.

Char of char * textStyle: string option

Equivalent to a MathAtom with MathAtomType.Ordinary, MathAtomType.Number or MathAtomType.Variable.

FixedChar of c:char * fontId:int

I don't really like the idea of coupling fonts with characters.

CharSymbol of isTextSymbol:bool

Do we really need an abstraction of text atoms here?

Symbol of name:char * TexAtomType * isDelimiter:bool

Equivalent to a MathAtom with MathAtomType.Ordinary, MathAtomType.Number or MathAtomType.Variable. CSharpMath does not separate text from symbols.

I think the CSM Accent is just the accent and not the thing accented.

It is really a group with the accent and accentee inside.

replace delimiter with a Delimiter enum?

It could be equivalent to the Boundary atom of CSharpMath.

charlesroddie commented 5 years ago

I don't really like the idea of coupling fonts with characters.

I agree. Let's discuss fonts and styles here: https://github.com/charlesroddie/MathAtom/issues/3

Do we really need an abstraction of text atoms here?

It's open for discussion whether the wpf-math cases are right here. I don't have an opinion at the moment. But we should have cases so that everything is covered by the right case. Then the information is contained transparently in the tree structure instead of hidden in nullable fields of Atom which are only valid in certain cases.

CSharpMath has the following properties in every Atom:

public string Nucleus { get; set; }
private IMathList _Superscript;
private IMathList _Subscript;
public FontStyle FontStyle { get; set; }
public Range IndexRange { get; set; }
public List<IMathAtom> FusedAtoms { get; set; }

This is why there are fewer subclasses in CSharpMath. But I prefer a complete subclass approach which would remove these properties.
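
As a rough illustration of what "complete subclass approach" could mean for scripts, here is a minimal C# sketch. It assumes hypothetical `Variable`, `Number` and `Scripts` classes (not the CSharpMath API) and moves the script slots out of the base class into their own case, as in the wpf-math `Scripts` atom.

```csharp
using System;

// Hypothetical base class: no Nucleus, Superscript or Subscript properties here.
public abstract class Atom { }

public sealed class Variable : Atom
{
    public char Name { get; }
    public Variable(char name) => Name = name;
}

public sealed class Number : Atom
{
    public string Digits { get; }
    public Number(string digits) => Digits = digits;
}

// Scripts exist only where they are needed, mirroring the wpf-math Scripts case.
public sealed class Scripts : Atom
{
    public Atom BaseAtom { get; }
    public Atom Subscript { get; }   // null when absent
    public Atom Superscript { get; } // null when absent

    public Scripts(Atom baseAtom, Atom subscript, Atom superscript)
    {
        BaseAtom = baseAtom ?? throw new ArgumentNullException(nameof(baseAtom));
        Subscript = subscript;
        Superscript = superscript;
    }
}

public static class Printer
{
    // Writes an atom back as TeX-like text using only the tree structure.
    public static string ToTex(Atom atom)
    {
        switch (atom)
        {
            case Variable v:
                return v.Name.ToString();
            case Number n:
                return n.Digits;
            case Scripts s:
                var text = ToTex(s.BaseAtom);
                if (s.Subscript != null) text += "_{" + ToTex(s.Subscript) + "}";
                if (s.Superscript != null) text += "^{" + ToTex(s.Superscript) + "}";
                return text;
            default:
                throw new ArgumentException($"Unhandled atom type: {atom.GetType().Name}");
        }
    }
}
```

Here `new Scripts(new Variable('x'), null, new Number("2"))` represents x squared, and a plain `Variable` carries no unused script fields.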

Happypig375 commented 5 years ago

Right. I didn't notice the number of redundant properties before. Guess I don't visit the code of MathAtom often enough. I'll try incorporating the wpf-math approach here.

Happypig375 commented 5 years ago

At the same time, I would want to get rid of the MathAtomType enum and replace it with type pattern switching on the MathAtom discriminated union. I want to apply the DRY principle here. I'll try writing an analyzer for switching on discriminated unions myself.

charlesroddie commented 5 years ago

At the same time, I would want to get rid of the MathAtomType enum and replace it with type pattern switching on the MathAtom discriminated union.

Right, we want the right C# model of DUs. Option A: an enum for all the cases (so finer than MathAtomType):

enum MathAtomCase { Row, Overlined, Underlined, Radical, ... }

public class MathAtom { public MathAtomCase Case; }

public class Row : MathAtom
{
    public List<MathAtom> Atoms;
    public Row() { Case = MathAtomCase.Row; } // each subclass sets its own case
}
...

public Box Layout(MathAtom a) {
    switch (a.Case) {
        case MathAtomCase.Row:
            Row r = (Row)a;
            return ...; // something using r
        case ...
    }
}

But I gather that C# doesn't give any warnings on incomplete enum switches. If it does then this would be the best approach. But if it doesn't then the enums don't give us much. Option B: forget the enums:

public class MathAtom { }

public class Row : MathAtom
{   public List<MathAtom> Atoms; }
...

public Box Layout(MathAtom a) {
    if (a is Row r)
        return ...; // something using r
    else if (a is ...)
    ...
}

(I may have got some C# syntax wrong.)

Were you considering either of these approaches?

Happypig375 commented 5 years ago

Enhanced Option B:

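public abstract class MathAtom {
  // Private constructor: only nested classes can derive from MathAtom
  private MathAtom() { }
  ...
  // Nested classes
  public sealed class Row : MathAtom {
    ...
  }
  public sealed class Color : MathAtom {
    ...
  }
}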

Plus a [DiscriminatedUnionAttribute] and a custom Roslyn analyzer to enforce the completeness of the switch.
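
For reference, a minimal compilable sketch of the closed hierarchy, assuming only two cases. The analyzer and the attribute are not shown because they do not exist yet; the throwing `default` branch is an assumed runtime fallback for what the analyzer would check at compile time, and `AtomInfo.CountAtoms` is a hypothetical consumer.

```csharp
using System;
using System.Collections.Generic;

public abstract class MathAtom
{
    // Private constructor: only the nested classes below can derive from MathAtom.
    private MathAtom() { }

    public sealed class Row : MathAtom
    {
        public IReadOnlyList<MathAtom> Atoms { get; }
        public Row(params MathAtom[] atoms) => Atoms = atoms;
    }

    public sealed class Color : MathAtom
    {
        public MathAtom Inner { get; }
        public string ColorString { get; }
        public Color(MathAtom inner, string colorString)
        {
            Inner = inner;
            ColorString = colorString;
        }
    }
}

public static class AtomInfo
{
    // Counts every atom in the tree by type pattern switching on the closed set of cases.
    public static int CountAtoms(MathAtom atom)
    {
        switch (atom)
        {
            case MathAtom.Row row:
                var total = 1;
                foreach (var child in row.Atoms) total += CountAtoms(child);
                return total;
            case MathAtom.Color color:
                return 1 + CountAtoms(color.Inner);
            default:
                // Reached only if a new nested case is added without updating this switch.
                throw new InvalidOperationException($"Unhandled case: {atom.GetType().Name}");
        }
    }
}
```

Because the constructor is private, code outside `MathAtom` cannot add new cases, which is what makes a completeness check over the nested classes meaningful.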

charlesroddie commented 5 years ago

I tried an F# DU Atom and a C# one similar to yours and they look identical to consuming C# code.