quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

Anomalous HTML line break for inline equations within quotes or followed by punctuation #5920

Open pglpm opened 1 year ago

pglpm commented 1 year ago

### Bug description

In HTML output, an inline formula within quotes (for example `some text "$a=b$" some text`) or followed by punctuation (for example `some text $a=b$, some text`) may at times get a line break right after the opening quote, or right before the closing quote, comma, or full stop.

### Steps to reproduce

  1. Save this example document testquotes.qmd:
    
    ---
    title: "Test quotes"
    format:
      html:
        html-math-method: mathjax
    ---

test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$" test "$a=b$"

test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$, test $a=b$,


2. Render the document to HTML:

quarto render testquotes.qmd --to html


3. Open the rendered HTML file `testquotes.html` in a browser. **Browsers tested: Firefox 114.0.1 (64-bit), Chromium 114.0.5735.106 (Official Build) snap (64-bit)** on Ubuntu 20.04.

4. Slowly resize the browser window, enlarging or shrinking it, and watch where the reflowed line breaks fall. You should see something like this screenshot (note the purple squares):
![html_output](https://github.com/quarto-dev/quarto-cli/assets/9976691/525cd3c5-f4fb-4410-b8f0-4516b1856cbc)

### Expected behavior

Line breaks shouldn't occur after opening quotes, or before closing quotes, commas, or full stops.

### Actual behavior

Line breaks sometimes occur after opening quotes, or before closing quotes, commas, or full stops, around an inline equation.

### Your environment

- IDE: GNU Emacs 28.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.20, cairo version 1.16.0) of 2022-07-28
- OS: Ubuntu 20.04 64-bit, Kernel Version: 5.14.0-1059-oem

### Quarto check output

    [✓] Checking versions of quarto binary dependencies...
          Pandoc version 3.1.1: OK
          Dart Sass version 1.55.0: OK
    [✓] Checking versions of quarto dependencies......OK
    [✓] Checking Quarto installation......OK
          Version: 1.3.353
          Path: /opt/quarto/bin

    [✓] Checking basic markdown render....OK

    [✓] Checking Python 3 installation....OK
          Version: 3.8.10
          Path: /usr/bin/python3
          Jupyter: (None)

          Jupyter is not available in this Python installation.
          Install with python3 -m pip install jupyter

    [✓] Checking R installation...........OK
          Version: 4.3.0
          Path: /usr/lib/R
          LibPaths:

    [✓] Checking Knitr engine render......OK

pglpm commented 1 year ago

This may be a bug in MathJax rather than in Quarto. But I wonder whether a workaround like the one mentioned here could be implemented.

pglpm commented 1 year ago

In case it may help others with the same problem, the other workaround mentioned here works:

To ensure that

test "$a=b$" test

doesn't get line breaks after the first quotation mark or before the last, add a span this way:

test ["$a=b$"]{style="display:inline-block;"} test

I don't know if this could be implemented in Quarto by default.

pglpm commented 1 year ago

This bug is still present in the latest version of Quarto (1.3.433). I was wondering whether there is any news about it, and whether anyone can confirm it. It is quite cumbersome to have to use the workaround described in this comment whenever punctuation follows or precedes inline maths. Cheers!

mcanouil commented 1 year ago

@pglpm As you can see, the issue is still open and not attached to a PR at this point.

mcanouil commented 1 year ago

To note, this is not a bug, it's HTML. You want browsers to read “<span class="math inline">\(a=b\)</span>” as one word, which it definitely is not, hence the need for a wrapping span, <span style="display:inline-block;">“<span class="math inline">\(a=b\)</span>”</span> (as you described yourself).

I am almost certain there is no better way to do this, i.e., your "workaround" is not a workaround but the actual solution. Overall, this question is 100% HTML.

pglpm commented 1 year ago

@mcanouil Thank you for the explanation, I understand. I wonder if the following solution could be implemented in Quarto:

  1. punctuation adjacent to an inline math block ($...$) should be put within the math-inline span when translating to HTML; for example <span class="math inline">“\(a=b\)”</span>;
  2. a style="display:inline-block;" should be added to the span.

So for example "$a=b$" in a .qmd file would be rendered as

<span class="math inline" style="display:inline-block;">“\(a=b\)”</span>

From some experiments it seems to solve the problem.

But I know very little about how this works – maybe it's all pandoc doing this, outside of what Quarto can do – so it may not make sense. Apologies in that case!

mcanouil commented 1 year ago

Seems very unlikely, because why not add other marks, such as ending ? or !, etc. Also, here you are using MathJax, but Quarto offers the ability to use various math libraries which do not behave the same way.

Here the output is produced by the Pandoc writer (`quarto pandoc index.qmd --from markdown --to html --mathjax`) and, as I said, in the end it comes down to HTML rules.

I really don't think something general can be done here and, in my opinion, putting quotes around an equation is a bit unorthodox, if I may.

To note, you could write a Lua filter to do the surrounding span.

pglpm commented 1 year ago

I see, thank you for one more explanation. I'll check out the other math libraries. I had to use MathJax because other libraries don't offer some options such as Italic Greek uppercase letters and similar.

It isn't so much about quotation marks as about commas and full stops. These are quite common after inline maths if you're giving a list of variables, for instance, or ending a sentence with a variable name. The problem appears in these cases too.

I'll think about Lua, but going to such lengths would make using Quarto pointless in my use case – I could go back to using LaTeX and produce a PDF with all the desired details and behaviour.

mcanouil commented 1 year ago

Here is a really simple example of a Lua filter which wraps any Quoted element that contains inline math in a span.

See https://quarto.org/docs/extensions/filters.html for the filter documentation.

return {
  {Quoted = function (elem) 
    for _, inline in ipairs(elem.content) do
      if inline.t == "Math" then
        return pandoc.Span(elem, {style = "display:inline-block;" })
      end
    end
    return elem
  end}
}

pglpm commented 1 year ago

Cheers! I'll try to adapt it to general adjacent punctuation. Side comment: great short tutorial about Lua filters in the link you shared! Have never used Lua, only heard the name.

mcanouil commented 1 year ago

The issue with punctuation marks is that you need to check the next element to know what it is, and then embed elements N and N+1 together (it's possible but less easy). Quoted is a special element.

Quarto document:

````qmd
---
format: native
---

test "$a=b$"

test $a=b$.

"something else"
````

Native AST:

```txt
Pandoc
  Meta { unMeta = fromList [] }
  [ Para
      [ Str "test"
      , Space
      , Quoted DoubleQuote [ Math InlineMath "a=b" ]
      ]
  , Para
      [ Str "test" , Space , Math InlineMath "a=b" , Str "." ]
  , Para
      [ Quoted DoubleQuote
          [ Str "something" , Space , Str "else" ]
      ]
  ]
```

pglpm commented 1 year ago

Thank you again! I'll explore this. Actually maybe I should contact Pandoc about this.

I don't know if you think it's worth keeping this issue ticket open or not. Feel free to close it if you think it isn't useful for Quarto dev.

mcanouil commented 1 year ago

I think I managed to cover all cases, @pglpm could you try it out? (Maybe it could be integrated in Quarto internal Lua filters.)

return {
  {Quoted = function (elem)
    for _, el in ipairs(elem.content) do
      if el.t == "Math" then
        return pandoc.Span(elem, {style = "display:inline-block;" })
      end
    end
    return elem
  end},
  {Para = function (elem)
    -- quarto.log.output(elem)
    local content = elem.content
    for i, el in ipairs(content) do
      if el.t == "Math" then
        if i > 1 then
          -- prev = content[i-1] -- not sure previous string element should be prepend ...
          -- if prev.t == "Str" then
          --   current_elem = {prev, el}
          --   table.remove(content, i-1)
          -- else 
          local current_elem = {el}
          -- end
          if i < #content then
            -- named next_el so Lua's built-in `next` is not overwritten
            local next_el = content[i+1]
            if next_el.t == "Str" then
              -- pull the following Str (e.g. a comma or full stop) into the span
              table.insert(current_elem, next_el)
              table.remove(content, i+1)
            end
          end
          content[i] = pandoc.Span(current_elem, {style = "display:inline-block;" })
        end
      end
    end
    return pandoc.Para(content)
  end}
}

pglpm commented 1 year ago

Wow thank you. I'll try it out now and get back to you.

pglpm commented 1 year ago

Seems to work perfectly!! It catches quotes, commas, full-stops, and exclamation marks. This is going to save me a lot of typing of brackets and braces, thank you so much!

pglpm commented 1 year ago

Please feel free to close this issue. It would be great if this could be implemented in Quarto. Otherwise, could this Lua code perhaps be included in the documentation? I'm sure it'd help many other Quarto users.

Thank you again!

pglpm commented 1 year ago

Hi @mcanouil, I discovered that the Lua code has a side effect: display maths no longer gets a line of its own. For example,

text
$$
a=b
$$
text

is now rendered with "a=b" inline. The problem does not occur if the paragraph is ended, that is

text

$$
a=b
$$

text

but this introduces some extra vertical space before and after the equation.

I'm trying to reverse-engineer the lua code to see if this can be fixed, will report if I find something.

mcanouil commented 1 year ago

To note, the correct/proper syntax is the one with the surrounding empty lines. To make a new line within a paragraph, the syntax is either two spaces at the end of the line or a backslash (`\`).

pglpm commented 1 year ago

Noted! I didn't know this.

simonkeys commented 1 month ago

This filter is very helpful! I made a couple of modifications to it.

I definitely only want it to apply to inline math. Also, I need it to include the previous element, to cover putting inline math inside parentheses: ($a=b$). For that case the program logic needed to be adjusted to avoid problems due to modifying content while iterating over it.

So I came up with the following, which seems to do what I want:

return {
  {Quoted = function (elem)
    for _, el in ipairs(elem.content) do
      if el.t == "Math" and el.mathtype == "InlineMath" then
        return pandoc.Span(elem, {style = "display:inline-block;" })
      end
    end
    return elem
  end},
  {Para = function (elem)
    -- quarto.log.output(elem)
    local content = elem.content
    local new_content = {}
    for i, el in ipairs(content) do
      if el.t == "Math" and el.mathtype == "InlineMath" then
        local current_elem = {el}
        if i > 1 then
          local prev = content[i-1]
          if prev.t == "Str" then
            -- move the preceding Str (e.g. an opening parenthesis) into the span
            table.insert(current_elem, 1, prev)
            table.remove(new_content, #new_content)
          end
        end
        if i < #content then
          -- named next_el so Lua's built-in `next` is not overwritten
          local next_el = content[i+1]
          if next_el.t == "Str" then
            -- move the following Str (e.g. a comma or full stop) into the span
            table.insert(current_elem, next_el)
            table.remove(content, i+1)
          end
        end
        local new_elem = pandoc.Span(current_elem, {style = "display:inline-block;" })
        table.insert(new_content, new_elem)
      else
        table.insert(new_content, el)
      end
    end
    return pandoc.Para(new_content)
  end}
}
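
A `Para` handler only sees paragraphs, so inline math in tight list items (which Pandoc represents as `Plain` blocks) or in headings is left alone. One possible variant, sketched here on the assumption that Pandoc 2.17 or later is in use (the `quarto check` output earlier in the thread reports 3.1.1), hooks the `Inlines` filter function instead, so every inline list is processed. The helper name `group_inline_math` and its `make_span` parameter are made up for this illustration and are not part of Pandoc or Quarto. The sketch covers only adjacent `Str` elements; the `Quoted` handler from the filters above would still be needed for quotation marks.

```lua
-- Sketch only: group each inline Math element with the Str elements
-- directly before and after it, without modifying the input list.
-- `make_span` is a hypothetical callback that builds the wrapper element.
local function group_inline_math(inlines, make_span)
  local result = {}
  local i = 1
  while i <= #inlines do
    local el = inlines[i]
    if el.t == "Math" and el.mathtype == "InlineMath" then
      local group = {}
      -- take back a preceding Str (e.g. an opening parenthesis) already emitted
      local last = result[#result]
      if last and last.t == "Str" then
        group[#group + 1] = table.remove(result)
      end
      group[#group + 1] = el
      -- consume a following Str (e.g. a comma or full stop)
      local following = inlines[i + 1]
      if following and following.t == "Str" then
        group[#group + 1] = following
        i = i + 1
      end
      result[#result + 1] = make_span(group)
    else
      result[#result + 1] = el
    end
    i = i + 1
  end
  return result
end

-- Top-level function picked up by Pandoc because the script returns no filter table.
function Inlines(inlines)
  local grouped = group_inline_math(inlines, function (group)
    return pandoc.Span(group, {style = "display:inline-block;"})
  end)
  return pandoc.Inlines(grouped)
end
```

Because the helper builds a new list rather than removing items from the one being iterated, it sidesteps the modify-while-iterating concern mentioned above. Display math is passed through untouched, as in the previous filter.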