siefkenj / unified-latex

Utilities for parsing and manipulating LaTeX ASTs with the Unified.js framework
MIT License

Problems with \^{} escape #28

Closed udovin closed 6 months ago

udovin commented 1 year ago

When I use \^{} to render the ^ symbol, I get the following result: <span class="macro macro-^"></span>.

OK, so I added a replaceNode handler for the ^ macro and found that it matches both of these:

$a^b$ // Replaced with replaceNode
a\^{}b // Also replaced with replaceNode

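This does not seem like the expected behaviour. Of course, I found a workaround: check that the node doesn't have an escapeToken. But this looks bad:

(node: Macro) => {
    // A node with escapeToken set (e.g. the ^ in $a^b$) is left unchanged.
    if (node.escapeToken !== undefined) {
        return node;
    }
    // Otherwise this is \^, so replace it with a literal caret.
    return s("^");
}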

I would expect that ^ in an equation is not treated as a macro.

siefkenj commented 1 year ago

Indeed, the ^ in an equation is considered a macro with escapeToken === "". That allows it to take arguments :-).

If you want to match a macro exactly, you can use match.macro(node, m("^")). Or, if you want to only replace macros in text mode, you can look in the info prop that's passed into the visitor. That contains a flag about whether the macro was found in math mode or not.
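
To illustrate the distinction described above, here is a minimal, self-contained sketch. It does not use the unified-latex API: `MacroLike` is a simplified stand-in for the library's macro node, and `isTextCaretMacro` and its `inMathMode` parameter are hypothetical names. The only facts taken from this thread are that the math-mode ^ is a macro with escapeToken === "" and that the visitor is told whether a node was found in math mode.

```typescript
// Simplified stand-in for a macro node; NOT the library's real type.
interface MacroLike {
    type: "macro";
    content: string;
    escapeToken?: string;
}

// Hypothetical helper: true only for a backslash-escaped \^ found in text mode.
function isTextCaretMacro(node: MacroLike, inMathMode: boolean): boolean {
    if (node.content !== "^") {
        return false;
    }
    if (inMathMode) {
        return false;
    }
    // Per the thread, the superscript ^ carries an empty escapeToken.
    return node.escapeToken !== "";
}

// The ^ in $a^b$, modeled with an empty escape token.
const mathSuperscript: MacroLike = { type: "macro", content: "^", escapeToken: "" };
// The \^ in a\^{}b, modeled without an explicit escape token.
const textAccent: MacroLike = { type: "macro", content: "^" };

const accentMatches = isTextCaretMacro(textAccent, false);
const superscriptMatches = isTextCaretMacro(mathSuperscript, true);
```

Checking both the escape token and the math-mode flag is redundant for typical input, but it keeps the helper conservative if only one of the two signals is available.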

udovin commented 1 year ago

I think I need to make a PR so that \^{} works as expected out of the box. Also, I haven't run into issues with escaping other special characters, but those are worth checking too.

udovin commented 1 year ago

I finally fixed all cases with the following macro replacement:

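"^": (_: Macro, info: VisitInfo) => {
    // Leave ^ untouched in math mode.
    if (info.context.inMathMode) {
        return undefined;
    }
    // We need to know where the macro sits among its siblings.
    if (info.index === undefined || !info.containingArray) {
        return undefined;
    }
    // Nothing follows the macro, so there is no {} to look at.
    if (info.containingArray.length <= info.index + 1) {
        return undefined;
    }
    // Only replace the macro when it is followed by an empty group.
    const nextToken = info.containingArray[info.index + 1];
    const content = printRaw(nextToken);
    if (content !== "{}") {
        return undefined;
    }
    return s("^");
}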

I checked the following cases:

\^a   // renders as â
\^{a} // renders as â
\^{}a // renders as ^a
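
The decision behind these three cases can be sketched without the library. The block below is only an illustration under stated assumptions: `TexNode` is a simplified, hypothetical node shape, `printRawSketch` is a toy stand-in for printRaw, and `caretReplacement` is a made-up name that mirrors the checks in the replacement above. It models how the sibling following \^ is inspected, not how unified-latex actually parses or renders the input.

```typescript
// Simplified, hypothetical node shapes; the real AST is richer than this.
type TexNode =
    | { type: "macro"; content: string }
    | { type: "string"; content: string }
    | { type: "group"; content: TexNode[] };

// Toy serializer standing in for printRaw.
function printRawSketch(node: TexNode): string {
    switch (node.type) {
        case "macro":
            return "\\" + node.content;
        case "string":
            return node.content;
        case "group":
            return "{" + node.content.map(printRawSketch).join("") + "}";
    }
}

// Returns "^" when the macro at `index` is \^ followed by an empty group
// in text mode; returns undefined when the macro should be left alone.
function caretReplacement(
    siblings: TexNode[],
    index: number,
    inMathMode: boolean
): string | undefined {
    const node = siblings[index];
    if (!node || node.type !== "macro" || node.content !== "^") {
        return undefined;
    }
    if (inMathMode) {
        return undefined;
    }
    const next = siblings[index + 1];
    if (next === undefined) {
        return undefined;
    }
    return printRawSketch(next) === "{}" ? "^" : undefined;
}

const caret: TexNode = { type: "macro", content: "^" };
const letterA: TexNode = { type: "string", content: "a" };

const bareArgument: TexNode[] = [caret, letterA]; // \^a
const bracedArgument: TexNode[] = [caret, { type: "group", content: [letterA] }]; // \^{a}
const emptyGroup: TexNode[] = [caret, { type: "group", content: [] }, letterA]; // \^{}a
```

Only the third array yields a replacement; the first two return undefined, so the accent handling for â is left to whatever processes the macro next.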

For now, I don't know how to fix this in unified-latex so that it works out of the box. @siefkenj, can you help me?

siefkenj commented 1 year ago

I'm a little confused about what you're doing/where you're hoping for things to change. Are you talking about the ligature code for HTML conversion, or are you trying to make a custom processor?

siefkenj commented 6 months ago

@udovin I'm closing this. You can reopen if you have more information.