phfaist / pylatexenc

Simple LaTeX parser providing latex-to-unicode and unicode-to-latex conversion
https://pylatexenc.readthedocs.io
MIT License

$x$ → 𝑥 or using chars from the unicode block "Mathematical Alphanumeric Symbols" for math #55

Open gamboz opened 3 years ago

gamboz commented 3 years ago

As a possible future enhancement, would it be possible to use the characters of the Unicode block "Mathematical Alphanumeric Symbols" when converting math symbols to text, so that "$x$" would become "𝑥" (U+1D465 MATHEMATICAL ITALIC SMALL X)? Maybe with an option to enable/disable the use of this block? Also here. Thanks :-)
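To make the request concrete, here is a minimal standard-library sketch of the mapping being asked for. It is not part of pylatexenc; the function name `to_math_italic` is hypothetical, and the sketch only covers ASCII letters. Note that the block has holes: italic small h is encoded separately as U+210E PLANCK CONSTANT, so a name lookup for it fails and needs a fallback.

```python
import unicodedata


def to_math_italic(text):
    """Map ASCII letters to math-italic code points; leave everything else alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            case = "CAPITAL" if ch.isupper() else "SMALL"
            try:
                ch = unicodedata.lookup(
                    "MATHEMATICAL ITALIC {} {}".format(case, ch.upper())
                )
            except KeyError:
                # Hole in the block: italic small h is U+210E PLANCK CONSTANT.
                # Any other unexpected miss keeps the original character.
                if ch == "h":
                    ch = "\u210e"
        out.append(ch)
    return "".join(out)


print(to_math_italic("x"))
```

A real implementation inside latex2text would presumably apply such a mapping only to character nodes in math mode, which this sketch does not attempt.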

phfaist commented 3 years ago

Thanks, that's a good suggestion. I'll try to think about using math alphanumeric symbols in math expressions soon. (I'm a bit hesitant to add functionality with options to the latex-to-text functions, because I'm considering revamping some parts of latex2text and any options I add now will have to be supported later, too.) In the meantime, you can use $\mathit{x}$, which is already supported. An alternative might be to keep the math LaTeX as-is and then somehow plug in an external unicode math renderer; there might be some useful ones around.

gamboz commented 2 years ago

In the hope that it may be useful, here is a piece of code that I use to translate the macro \cal to "MATHEMATICAL SCRIPT CAPITAL XXX".

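import logging
import unicodedata

def replace_macro_cal(node, l2tobj):
    r"""Script letters.

    \cal O  →  𝒪
    \cal a  →  𝒶 (not implemented; NB: "MATHEMATICAL SCRIPT SMALL O" does not exist)
    """
    # Convert the macro's single argument to plain text.
    letter = l2tobj.nodelist_to_text([node.nodeargd.argnlist[0]])
    # Only a single uppercase ASCII letter is supported.
    if not (len(letter) == 1 and "A" <= letter <= "Z"):
        logging.error(r'Invalid character "\cal %s" at pos %s.',
                      letter, node.pos)
        return "ERROR"
    try:
        return unicodedata.lookup("MATHEMATICAL SCRIPT CAPITAL " + letter)
    except KeyError:
        # B, E, F, H, I, L, M, R are encoded in "Letterlike Symbols" instead.
        return unicodedata.lookup("SCRIPT CAPITAL " + letter)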

Of course, this must be plugged into the LatexContextDb used for the text generation, and the macro definition std_macro('cal', '{') must be added to the context db of the parser.