siefkenj / unified-latex

Utilities for parsing and manipulating LaTeX ASTs with the Unified.js framework
MIT License

[unified-latex-prettier] Maybe introduce more break points? #25

Open Enter-tainer opened 1 year ago

Enter-tainer commented 1 year ago

Hi! I'm building a math formatter for Markdown documents and I found this cool project!

I use remark-math to extract math blocks from the document, then format it using unified-latex-prettier. It works flawlessly and the output generally looks good to me.

There is one case where the output looks odd, though. The default 'printWidth' is 80, but the formatter sometimes produces lines far longer than 80 columns. Here is one example:

\begin{aligned}
  \lbrack x^{n+t}\rbrack (A_{0}(x)B_{0}(x)) & =\sum_{i=0}^{n+t}\left(\lbrack x^{i}\rbrack A_{0}(x)\right)\left(\lbrack x^{n+t-i}\rbrack B_{0}(x)\right) \\
                                            & =\sum_{i=0}^{n+t}a_{i}c^{i^2-(i-t)^2}                                                                     \\
                                            & =c^{-t^2}\sum_{i=0}^{n+t}a_{i}c^{2it}                                                                     \\
                                            & =c^{-t^2}A(c^{2t})
\end{aligned}

I inspected the code and the generated Prettier doc, and found that line breaks are only possible at the end of each row. As a result, Prettier cannot break this long line any further.

fill([
  [
    "\\begin{aligned}",
    indent([
      hardline,
      "\\lbrack x^{n+t}\\rbrack (A_{0}(x)B_{0}(x)) & =\\sum_{i=0}^{n+t}\\left(\\lbrack x^{i}\\rbrack A_{0}(x)\\right)\\left(\\lbrack x^{n+t-i}\\rbrack B_{0}(x)\\right) ",
      "\\\\",
      hardline,
      "                                          & =\\sum_{i=0}^{n+t}a_{i}c^{i^2-(i-t)^2}                                                                     ",
      "\\\\",
      hardline,
      "                                          & =c^{-t^2}\\sum_{i=0}^{n+t}a_{i}c^{2it}                                                                     ",
      "\\\\",
      hardline,
      "                                          & =c^{-t^2}A(c^{2t})                                                                                        ",
    ]),
    hardline,
    "\\end{aligned}",
  ],
])

This may be because the code simply prints a node verbatim when its type is string.

I think this can be improved by introducing more break points. For example, soft line breaks could be added before \left, after \right, and around +, =, and -. We could also add softline breaks at the end of any macro.
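To make the proposal concrete, here is a minimal sketch of where those break points would fall. It works on the printed string rather than on the AST, which is an assumption made for brevity. `splitAtBreakCandidates` and `tokenEnd` are hypothetical helpers, not part of unified-latex or Prettier. The sketch breaks before \left, after \right and its delimiter, and before +, = and - (only before the operator, and only outside braces, so that sub- and superscripts stay intact).

```typescript
// Hypothetical helper: index just past the token starting at `i`.
// A token is either a control sequence (`\name` or `\` + one symbol)
// or a single character.
function tokenEnd(s: string, i: number): number {
    if (s.charAt(i) !== "\\") return i + 1;
    let j = i + 1;
    if (/[A-Za-z]/.test(s.charAt(j))) {
        while (/[A-Za-z]/.test(s.charAt(j))) j++;
        return j;
    }
    return Math.min(j + 1, s.length);
}

// Hypothetical helper: split one printed row into chunks at the
// candidate break points. Joining the chunks gives back the input.
function splitAtBreakCandidates(row: string): string[] {
    const chunks: string[] = [];
    let current = "";
    let depth = 0; // nesting depth of `{ ... }`
    const flush = () => {
        if (current !== "") {
            chunks.push(current);
            current = "";
        }
    };
    let i = 0;
    while (i < row.length) {
        const end = tokenEnd(row, i);
        const tok = row.slice(i, end);
        if (tok === "{") depth++;
        if (tok === "}") depth = Math.max(0, depth - 1);
        const atTop = depth === 0;
        // Start a new chunk before `\left` and before top-level operators.
        if (atTop && (tok === "\\left" || tok === "+" || tok === "-" || tok === "=")) {
            flush();
        }
        current += tok;
        i = end;
        // End the chunk after `\right` together with its delimiter.
        if (atTop && tok === "\\right" && i < row.length) {
            const delimEnd = tokenEnd(row, i);
            current += row.slice(i, delimEnd);
            i = delimEnd;
            flush();
        }
    }
    flush();
    return chunks;
}

console.log(splitAtBreakCandidates("a & =\\sum_{i=0}^{n+t}\\left(x\\right)\\left(y\\right)+b"));
// → ["a & ", "=\\sum_{i=0}^{n+t}", "\\left(x\\right)", "\\left(y\\right)", "+b"]
```

In the printer, chunks like these would presumably be interleaved with softline inside a fill or group, so that Prettier only breaks when the row exceeds 'printWidth'.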

Enter-tainer commented 1 year ago

I can give it a try and send a PR if this is feasible. 🚀

siefkenj commented 1 year ago

@Enter-tainer I'm glad you're finding the project useful!

I think the barrier is not technical. Rather, it is a question of how aligned environments should look. At the moment, aligned environments are forced to align at the & marks, regardless of whether the result fits within the print width. I'm honestly not sure what the best behavior is here.

One possible alternative is to break and indent at each & if the line gets too long. Something like:

\begin{aligned}
  \lbrack x^{n+t}\rbrack (A_{0}(x)B_{0}(x)) 
    & =\sum_{i=0}^{n+t}\left(\lbrack x^{i}\rbrack A_{0}(x)\right)\left(\lbrack x^{n+t-i}\rbrack B_{0}(x)\right) \\
    & =\sum_{i=0}^{n+t}a_{i}c^{i^2-(i-t)^2}                                                                     \\
    & =c^{-t^2}\sum_{i=0}^{n+t}a_{i}c^{2it}                                                                     \\
    & =c^{-t^2}A(c^{2t})
\end{aligned}

That still causes long lines though... Your suggestion of breaking at \left and \right is a nice idea. It definitely works if the content that is too long is in the last column. When it's in the middle column or the first column, I worry about the readability.
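The break-at-& fallback described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: `formatAligned` is a hypothetical function, rows are given as plain arrays of cell strings, and the fallback drops the padding that lines up the trailing \\ marks.

```typescript
type Row = string[];

// Hypothetical sketch: keep the &-aligned layout when every line fits
// within `printWidth`; otherwise put each `&` cell on its own indented line.
function formatAligned(rows: Row[], printWidth = 80): string[] {
    const colCount = Math.max(0, ...rows.map((r) => r.length));
    // Widest cell in each column, used to pad the aligned layout.
    const widths = Array.from({ length: colCount }, (_, c) =>
        Math.max(0, ...rows.map((r) => (r[c] ?? "").length))
    );
    const aligned = rows.map((row, r) => {
        const line = widths.map((w, c) => (row[c] ?? "").padEnd(w)).join(" & ");
        // Every row except the last ends with a row separator.
        return r < rows.length - 1 ? line + " \\\\" : line.trimEnd();
    });
    if (aligned.every((l) => l.length <= printWidth)) {
        return aligned;
    }
    // Fallback: first cell on its own line, later cells indented after `&`.
    const out: string[] = [];
    rows.forEach((row, r) => {
        const lines: string[] = [];
        row.forEach((cell, c) => {
            if (c === 0) {
                if (cell.trim() !== "") lines.push(cell.trim());
            } else {
                lines.push("  & " + cell.trim());
            }
        });
        if (r < rows.length - 1 && lines.length > 0) {
            lines[lines.length - 1] += " \\\\";
        }
        out.push(...lines);
    });
    return out;
}

console.log(formatAligned([["a", "=b"], ["", "=c"]], 5).join("\n"));
```

As noted, a single overlong cell still produces an overlong line in the fallback, so this would only help in combination with extra break points inside the cells.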

A PR for this would be welcome :-D (Maybe you could add a print option like --wrapAlignEnvironments?)

Also, feel free to generate some proposals/examples for line breaking in align/tabular environments.