asciidoctor / asciimath

Asciimath parser
MIT License

Optionally leave Unicode characters unescaped in MathML output #59

Open skalee opened 3 years ago

skalee commented 3 years ago

This is a feature request. I can provide an implementation, but I wanted to discuss it first.


MathML does not enforce any particular encoding: UTF-8 is legal, and non-ASCII characters can be used without escaping them [W3C].

However, this gem always encodes non-ASCII characters using numeric XML character references (e.g. &#xE9; for é):

https://github.com/asciidoctor/asciimath/blob/3a4bbab7da1bdcf4b64034692a79a69e161257f2/lib/asciimath/mathml.rb#L234-L249

This is safer, as it never depends on the parent document's encoding, but on the other hand it hampers readability.

My suggestion is to add an :escape_non_ascii option to the Expression#to_mathml method, which would disable this kind of escaping (of course <, >, and & would be escaped anyway). This option should default to false.
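
To make the proposal concrete, here is a minimal sketch of the two behaviours. The helper escape_for_mathml is hypothetical and is not the gem's code; the flag is passed explicitly so that no default is implied, and the hexadecimal reference form is an assumption made for illustration.

```ruby
# Hypothetical helper, not part of the asciimath gem.
# Always escapes &, < and >; escapes non-ASCII characters only on request.
def escape_for_mathml(text, escape_non_ascii)
  # Characters that are significant in XML are escaped unconditionally.
  escaped = text.gsub(/[&<>]/, "&" => "&amp;", "<" => "&lt;", ">" => "&gt;")
  return escaped unless escape_non_ascii

  # Assumption: hexadecimal numeric character references, e.g. U+00E9 => &#xE9;
  escaped.gsub(/[^\x00-\x7F]/) { |ch| format("&#x%X;", ch.ord) }
end

puts escape_for_mathml("\u00E9 < 1", true)   # prints "&#xE9; &lt; 1"
puts escape_for_mathml("\u00E9 < 1", false)  # prints "é &lt; 1"
```

With the flag off, the output is only safe if the surrounding document is served in an encoding that can represent the characters, such as UTF-8.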

Perhaps a similar option could be added to the Expression#to_html method.

pepijnve commented 3 years ago

I've added the code to do this already. I'm still trying to figure out how I can let users pass options to to_mathml without breaking backwards compatibility. Looks like I've painted myself into a corner there.

skalee commented 3 years ago

Maybe something like:

class Expression
  def to_mathml(prefix = "", attrs = {}, options = {})
  end
end

or:

class Expression
  def to_mathml(prefix = "", attrs_or_escape_non_ascii = nil, attrs_or_nil = nil)
    attrs = #...
    escape_non_ascii = #...
  end
end

or even:

class Expression
  # use like:
  # expr.with_options(escape_non_ascii: false).to_mathml
  def with_options(options)
    dup.tap do |new_expr|
      # set some options on the expression, like disabling escaping, which is quite odd, but at least gives a reasonable interface
    end
  end
end

Keyword arguments would be even better, but I suppose this gem still supports good old Ruby 1.9 for some reason?
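
For comparison, a keyword-argument version might look like the sketch below. SketchExpression is a hypothetical stand-in, not the gem's class, and it only reports how the arguments bind. The default of true is an assumption chosen to keep the current escaping behaviour. The sketch relies on Ruby 3.0 or later, where a braced hash passed positionally is no longer treated as keywords.

```ruby
# Hypothetical stand-in for Expression; it only shows how arguments bind.
class SketchExpression
  # prefix and attrs stay positional, so existing two-argument calls keep working.
  # The default for escape_non_ascii is an assumption, not the gem's decision.
  def to_mathml(prefix = "", attrs = {}, escape_non_ascii: true)
    { prefix: prefix, attrs: attrs, escape_non_ascii: escape_non_ascii }
  end
end

expr = SketchExpression.new
p expr.to_mathml("m:", { "class" => "eq" })      # old-style call: attrs is the hash
p expr.to_mathml("m:", escape_non_ascii: false)  # new keyword: attrs stays {}
```

On Ruby versions before 3.0 a trailing hash with symbol keys could be absorbed as keywords, so this signature would need more care there.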