siefkenj / unified-latex

Utilities for parsing and manipulating LaTeX ASTs with the Unified.js framework
MIT License
89 stars, 22 forks

building a bibtex plugin #65

Closed retorquere closed 8 months ago

retorquere commented 9 months ago

I'd like to take a stab at a bib(la)tex plugin. I have an existing parser which works well, but I wouldn't half mind sharing the work of upkeep 😄 . Where can I find information to get me started? Is this a sensible thing to do with unified-latex?

retorquere commented 9 months ago

The ligatures it understands are here

Ah wait -- I had c defined as a macro, and that interferes with expandUnicodeLigatures

retorquere commented 9 months ago

This should give me a decent start. Parsing is bound to be better (certainly cleaner) with unified -- next up is sentence-casing and markup.

retorquere commented 9 months ago

Oh yeah the ancestry would be handy -- that would be handled by standard unist handlers? Or is there something specific in unified-latex for this?

siefkenj commented 9 months ago

You should use unified-latex-util-visit. It will give you fancy visitors.

retorquere commented 9 months ago

How can I tell whether a node is in the first or another argument of the href macro?

siefkenj commented 9 months ago

You need to look at the parent and do indexOf(self) in the argument list.
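A minimal sketch of that lookup, using a hand-built stand-in for the AST of `\href{https://example.com}{label}` rather than a parsed tree. The node shapes (a macro whose `args` hold argument nodes, each with a `content` array) mirror the ones used elsewhere in this thread. `argumentIndex` is a hypothetical helper, and the assumption that the ancestor list is ordered nearest-first should be checked against the visitor's documentation.

```javascript
// Hand-built stand-in for \href{https://example.com}{label}; a real tree
// would come from the parser.
const urlText = { type: 'string', content: 'https://example.com' }
const labelText = { type: 'string', content: 'label' }
const href = {
  type: 'macro',
  content: 'href',
  args: [
    { type: 'argument', content: [urlText], openMark: '{', closeMark: '}' },
    { type: 'argument', content: [labelText], openMark: '{', closeMark: '}' },
  ],
}

// Hypothetical helper: given the ancestors of a node (nearest first),
// return the position of the enclosing argument within the macro's args,
// or -1 when the node does not sit directly inside a macro argument.
function argumentIndex(parents) {
  const [argument, macro] = parents
  if (!argument || argument.type !== 'argument') return -1
  if (!macro || macro.type !== 'macro' || !Array.isArray(macro.args)) return -1
  return macro.args.indexOf(argument)
}

// Ancestors of urlText are [first argument, href]; of labelText, [second argument, href].
const urlArgIndex = argumentIndex([href.args[0], href])
const labelArgIndex = argumentIndex([href.args[1], href])
```

Here `urlArgIndex` is 0 and `labelArgIndex` is 1, so a handler could treat the URL argument differently from the label.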

retorquere commented 8 months ago

Where does self come from? Is that one of the arguments for the visit handler?

siefkenj commented 8 months ago

I made up the word self. You can look up the index of the current node via the info object passed in from visit.

retorquere commented 8 months ago

The index is present in the info object, but it is always undefined.

retorquere commented 8 months ago

Where does the code live that packs the token stream into arguments based on the macros parameter?

siefkenj commented 8 months ago

Look for the gobbleSingleArgument function.

retorquere commented 8 months ago

In visit, can I add something to context for child nodes, like inMathMode?

retorquere commented 8 months ago

In visit, can I replace child nodes? Could I do something like this:

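  import { visit } from '@unified-latex/unified-latex-util-visit'
  import { printRaw } from '@unified-latex/unified-latex-util-print-raw'

  visit(tree, (node, info) => {
    // node.args may be missing if no arguments were attached to the macro
    if (node.type === 'macro' && /^(url|href)$/.test(node.content) && node.args?.[0]) {
      node.args[0].content = [ { type: 'string', content: printRaw(node.args[0].content) } ]
    }
  })

and not mess up the AST? Position info would be lost, for example.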

retorquere commented 8 months ago

It looks like visit starts at the leaves and works up to the root. Is there a way to walk the tree starting at the root and working down?

edit: I suppose that answers my previous question about modifying child nodes in the parent.

siefkenj commented 8 months ago

You can pass in enter and leave functions to visit to control the traversal method. You can mutate during visit, but if you just want to change a specific node, I recommend replaceNode.
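To illustrate the difference between the two hooks, here is a small stand-in walker. It is not the library's `visit`: `walk` and its `parents` bookkeeping are assumptions made for this sketch, and it only follows `args` and `content` arrays. The point is that `enter` fires on the way down (root first) and `leave` fires on the way back up (leaves first).

```javascript
// Minimal stand-in walker; NOT the library's visit(). It calls `enter`
// before descending into a node's children and `leave` afterwards.
function walk(node, { enter, leave }, parents = []) {
  if (enter) enter(node, { parents })
  const children = []
  if (Array.isArray(node.args)) children.push(...node.args)
  if (Array.isArray(node.content)) children.push(...node.content)
  for (const child of children) {
    walk(child, { enter, leave }, [node, ...parents])
  }
  if (leave) leave(node, { parents })
}

// Hand-built tree standing in for \emph{hi}.
const tree = {
  type: 'root',
  content: [
    {
      type: 'macro',
      content: 'emph',
      args: [{ type: 'argument', content: [{ type: 'string', content: 'hi' }] }],
    },
  ],
}

// Record the order in which the hooks fire.
const order = []
walk(tree, {
  enter: (node) => order.push(`enter ${node.type}`),
  leave: (node) => order.push(`leave ${node.type}`),
})
```

After the walk, `order` starts with `enter root` and ends with `leave root`. A transformation that has to run on a parent before its children belongs in `enter`, and one that needs already-processed children belongs in `leave`.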

retorquere commented 8 months ago

So in this case I would test each node's parent, check whether it is the first argument of a url or href macro, and then have the node replace itself.

In the end I will want to achieve something like this -- is there a way to do the replaceNode calls and the visiting inside the unified chain?

Regarding context: I need to apply sentence-casing during conversion, which is steered by information I can build top-down. I need this information available on the individual nodes as I'm transforming them. Is the context something I can use for this, or should I just attach it to each node?
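One way to picture the top-down option without attaching anything to the nodes is a walker that hands each child the context its parent derived. The sketch below is self-contained and does not use the library's `visit` or its context object. `walkWithContext` and the `protectCase` flag are hypothetical names, and the rule that a brace group protects the case of its contents is only an assumption about what sentence-casing would need.

```javascript
// Hypothetical walker that threads a context object from parent to child.
// The visitor may return a new context for the node's children; returning
// nothing keeps the current one.
function walkWithContext(node, context, visitor) {
  const childContext = visitor(node, context) ?? context
  const children = []
  if (Array.isArray(node.args)) children.push(...node.args)
  if (Array.isArray(node.content)) children.push(...node.content)
  for (const child of children) walkWithContext(child, childContext, visitor)
}

// Hand-built stand-in for the field value: Title {NASA}
const tree = {
  type: 'root',
  content: [
    { type: 'string', content: 'Title' },
    { type: 'group', content: [{ type: 'string', content: 'NASA' }] },
  ],
}

// Collect each string together with the context it was visited under.
const seen = []
walkWithContext(tree, { protectCase: false }, (node, context) => {
  if (node.type === 'string') seen.push([node.content, context.protectCase])
  // Assumption: a brace group switches case protection on for everything below it.
  if (node.type === 'group') return { ...context, protectCase: true }
})
```

`seen` ends up pairing `Title` with `false` and `NASA` with `true`, so the casing decision is available at each string node while the tree itself stays unmodified.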

retorquere commented 8 months ago

I'm not in this to cause us both frustration. Let's just call this exploration closed.