Closed GarkGarcia closed 4 years ago
Consider \int x^2 dx .
That is not actually how most people write it, because there is not enough space between the x^2 and the dx. So it is common for the LaTeX source to be written \int x^2 \, dx .
But some people think the "d" in "dx" should be upright, not italic. Some people make a "\d" macro for that.
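For illustration, here is a minimal LaTeX sketch of the two conventions. The macro name \dd is hypothetical; it is used here because \d is already defined in LaTeX as the dot-under accent, so an author who really wants \d would have to redefine it.

```latex
\documentclass{article}
% \dd is a made-up name for this sketch; \d itself is already taken
% by LaTeX's dot-under accent.
\newcommand{\dd}{\mathrm{d}}
\begin{document}
% Italic d with a thin space before the differential:
\[ \int x^2 \, dx \]
% Upright d via the macro:
\[ \int x^2 \, \dd x \]
\end{document}
```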
Is it the decision of the author, or the viewer (or renderer) how to typeset the dx in an integral?
That is, if your table outputs it one way, but someone wants it the other way, are you planning to accommodate that?
I think we did most of the above already @GarkGarcia. MathML and HTML are effectively the same thing, so for all practical purposes we can treat them as such. The HTML backend is kind of an experiment someone contributed that isn't 100% complete, so I wouldn't emphasise that one too much just yet.
::AsciiMath.parse takes optional parser and color tables.
That is, if your table outputs it one way, but someone wants it the other way, are you planning to accommodate that?
Both I think. The idea is to make the tables configurable. Tweaking the parser table is tricky since the semantics change, but tweaking the rendering table should be easy.
The parser symbol table determines how the parser interprets the asciimath input. If you put 'dx' => :dx in there, you'll get a :dx symbol in the AST. If you do not have that entry you'll get d and x identifiers instead. Just 'd' => :d will get you symbol :d and identifier x.
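The effect of those three table variants can be sketched with a toy longest-match tokeniser. This is not the gem's actual tokeniser; the tokenize method and the token shapes are assumptions made up for this illustration.

```ruby
# Hypothetical stand-in for the tokeniser: longest-match lookup in a parser
# symbol table, falling back to single-character identifiers.
def tokenize(input, table)
  longest = table.keys.map(&:length).max || 0
  tokens = []
  pos = 0
  while pos < input.length
    match = nil
    # Try the longest candidate first so that 'dx' wins over 'd'.
    longest.downto(1) do |len|
      candidate = input[pos, len]
      if candidate.length == len && table.key?(candidate)
        match = candidate
        break
      end
    end
    if match
      tokens << [:symbol, table[match]]
      pos += match.length
    else
      tokens << [:identifier, input[pos]]
      pos += 1
    end
  end
  tokens
end

WITH_DX = tokenize('dx', { 'dx' => :dx }) # one :dx symbol
WITHOUT = tokenize('dx', {})              # identifiers d and x
ONLY_D  = tokenize('dx', { 'd' => :d })   # symbol :d, identifier x
```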
On the rendering side the symbol table determines what you actually output. As a silly example :dx => '\mymacro' would get you \mymacro in your output.
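A rough sketch of that rendering-side lookup, assuming a flat Symbol-to-String table. The names DEFAULT_LATEX and render_latex are hypothetical and only stand in for whatever the renderer does internally.

```ruby
# Hypothetical default rendering table: AST symbol => output text.
DEFAULT_LATEX = { :dx => 'dx', :alpha => '\alpha' }.freeze

# Render a list of AST symbols; unknown symbols fall back to their own name.
def render_latex(symbols, table = DEFAULT_LATEX)
  symbols.map { |sym| table.fetch(sym) { sym.to_s } }.join(' ')
end

# Overriding a single entry changes the output without touching the parser.
CUSTOM_LATEX = DEFAULT_LATEX.merge(:dx => '\mymacro').freeze

DEFAULT_OUT = render_latex([:alpha, :dx])
CUSTOM_OUT  = render_latex([:alpha, :dx], CUSTOM_LATEX)
```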
I've intentionally left the symbol table configuration bits out of this gem. Some people might want to load that from CSV, YAML, JSON, ...; others might want to just hardcode it in Ruby code. I don't think a little parser library like this one should impose one choice.
I think we did most of the above already @GarkGarcia
Great! I guess we're only missing the parameter wiring then? Also, the LaTeX renderer does not use a SymbolsTable.
It wouldn't be that hard of a change to implement, but I haven't figured out what the second argument of SymbolsTable.add is supposed to represent. It looks like it's something essential to the parser, but I don't understand why it is necessary for the renderers.
- Not sure how useful that would be. The MathML and HTML backend just render hex RGB values which is fine. Color names might be more elegant when you look at the source, but I doubt anyone really cares.
Fair enough. As you know, I've been working hard so that the LaTeX renderer produces the most idiomatic and readable code. I still believe this is a relevant issue, but we could fix it later.
I've intentionally left the symbol table configuration bits out of this gem. Some people might want to load that from CSV, YAML, JSON, ...; others might want to just hardcode it in Ruby code. I don't think a little parser library like this one should impose one choice.
I agree, it's better to keep things as simple as possible in here. I see an opportunity to create a CLI utility to handle that kind of thing. It could be very useful for command-line scripting.
The idea is to create a richer client for the library. The library's command-line interface is very useful for debugging, but it's a bit limited overall.
I haven't figured out what the second argument of SymbolsTable.add is supposed to represent.
SymbolTableBuilder is a little utility class that helps build a frozen Hash. The basic signature for add is add(*keys, value, type). For each value in the keys Array, a Hash entry will be created with value {:value => value, :type => type}. The precise semantics of the keys, values and types are not specified; they depend on the concrete usage.
For the parser table you have entries like
b.add('ii', :italic, :unary)
b.add('->>', 'twoheadrightarrow', :twoheadrightarrow, :symbol)
which results in
{
'ii' => {:value => :italic, :type => :unary},
'->>' => {:value => :twoheadrightarrow, :type => :symbol},
'twoheadrightarrow' => {:value => :twoheadrightarrow, :type => :symbol},
}
The hash keys are the strings the tokeniser recognises. The :type informs the parser about what type of node it should create. Is the thing I just parsed a symbol, a unary operator, ... The :value is the value that gets stored in the AST node.
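To make the add(*keys, value, type) behaviour concrete, here is a minimal stand-in for the builder. SketchTableBuilder and its build method are assumptions written for this example, not the gem's actual source; only the add signature and the resulting Hash shape come from the description above.

```ruby
# Minimal stand-in for SymbolTableBuilder (hypothetical, for illustration).
class SketchTableBuilder
  def initialize
    @table = {}
  end

  # Every key maps to the same {:value => value, :type => type} entry.
  def add(*keys, value, type)
    entry = { :value => value, :type => type }.freeze
    keys.each { |key| @table[key] = entry }
    self
  end

  # Hand out a frozen copy so later add calls cannot mutate it.
  def build
    @table.dup.freeze
  end
end

b = SketchTableBuilder.new
b.add('ii', :italic, :unary)
b.add('->>', 'twoheadrightarrow', :twoheadrightarrow, :symbol)
PARSER_TABLE = b.build
```

With two keys in the second call, both '->>' and 'twoheadrightarrow' end up pointing at the same entry, which matches the Hash shown above.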
The MathML renderer creates its table like this
b.add(:dx, 'dx', :identifier)
b.add(:and, 'and', :text)
b.add(:minus, "\u2212", :operator)
resulting in
{
:dx => {:value => 'dx', :type => :identifier},
:and => {:value => 'and', :type => :text},
:minus => {:value => "\u2212", :type => :operator},
}
Here the hash keys are the Symbol values corresponding to the :value entries from the parser table. The :value is the text that's going to be written in the output. The :type determines the MathML tag (or HTML CSS class) that gets used.
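As a sketch of how such a rendering table might be consumed: the mapping from :type to the MathML elements mi, mtext and mo is an assumption for this example (those are standard MathML token elements, but the backend's real mapping may differ), and render_symbol is a hypothetical helper.

```ruby
# Assumed mapping from entry :type to MathML element name.
TAGS = { :identifier => 'mi', :text => 'mtext', :operator => 'mo' }.freeze

RENDER_TABLE = {
  :dx    => { :value => 'dx',     :type => :identifier },
  :and   => { :value => 'and',    :type => :text },
  :minus => { :value => "\u2212", :type => :operator },
}.freeze

# Look up the AST symbol, then wrap its :value in the tag chosen by :type.
def render_symbol(symbol, table = RENDER_TABLE)
  entry = table.fetch(symbol)
  tag = TAGS.fetch(entry[:type])
  "<#{tag}>#{entry[:value]}</#{tag}>"
end

MINUS_MARKUP = render_symbol(:minus)
AND_MARKUP   = render_symbol(:and)
```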
In the end you don't have to use this SymbolTableBuilder; the parser and MathML backend just expect something Hash-like where the values are Hashes with a certain set of keys. SymbolTableBuilder just makes it easier to create that Hash.
In the end you don't have to use this SymbolTableBuilder; the parser and MathML backend just expect something Hash-like where the values are Hashes with a certain set of keys. SymbolTableBuilder just makes it easier to create that Hash.
Ohh, I see. Makes sense.
I made LatexBuilder::SYMBOL_TABLE public and created an additional (optional) parameter in LatexBuilder::initialize so that users can pass custom symbol tables to the renderer. @pepijnve Could you take a look at #47?
I also renamed LatexBuilder::SYMBOLS_TABLE and MarkupBuilder::DEFAULT_DISPLAY_SYMBOL_TABLE to Whatever::SYMBOL_TABLE for consistency.
The changes we've been working on recently should make it pretty easy to make the parser and the renderers extensible. The code is already there; we just need to create some facilities for users to use it, and document it.
My proposal is the following:
- Make SymbolsTable render-agnostic. The idea is that SymbolsTable would have a column for MathML, a column for LaTeX and a column for HTML, as well as a column for the AsciiMath expression and a column for the Ruby symbol that represents it (for example :aleph). If a cell is left empty, the renderers would use a (renderer-specific) default strategy to render the symbol. This would allow users to create extensions by using custom symbol tables. Want the parser and the renderers to handle a custom symbol of yours? Simply create a row for it in your symbols table.
- Create an optional parameter for AsciiMath.parse that represents the symbols table that should be used by the parser. Its default value should be the symbols table currently used by the parser.
- Create an optional parameter that represents the ColorTable that should be used by the parser. Its default value should be the HTML standard color names.
- Create an optional parameter in MarkupBuilder.initialize that represents which SymbolsTable should be used when rendering the markup. Its default value should be the default SymbolsTable used by each renderer.
- Create an optional parameter in MarkupBuilder.initialize that represents a map between RGB values and color names.
@pepijnve What do you think?
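To make the first point of the proposal more tangible, here is a rough sketch of a render-agnostic table with one column per renderer and a per-renderer fallback for empty cells. Everything in it (Row, ROWS, DEFAULTS, cell, the myop row and the fallback strategies) is hypothetical; none of it exists in the gem today.

```ruby
# One row per symbol; a nil cell means "use the renderer's default strategy".
Row = Struct.new(:asciimath, :symbol, :mathml, :latex, :html)

ROWS = [
  Row.new('aleph', :aleph, "\u2135", '\aleph', nil),       # HTML cell empty
  Row.new('myop',  :myop,  nil, '\operatorname{myop}', nil), # custom symbol
].freeze

# Assumed default strategies, one per renderer.
DEFAULTS = {
  :mathml => ->(row) { row.asciimath },
  :latex  => ->(row) { "\\#{row.asciimath}" },
  :html   => ->(row) { row.asciimath },
}.freeze

# Return the explicit cell if present, otherwise the renderer's default.
def cell(row, renderer)
  row[renderer] || DEFAULTS.fetch(renderer).call(row)
end

ALEPH_LATEX = cell(ROWS[0], :latex)  # explicit cell
ALEPH_HTML  = cell(ROWS[0], :html)   # falls back to the AsciiMath text
```

A parser table and each renderer table could then be derived from the same rows, so a user extension would only need to add a row.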