lierdakil / pandoc-crossref

Pandoc filter for cross-references
https://lierdakil.github.io/pandoc-crossref/
GNU General Public License v2.0

How to make wrapping `<div>` not break a paragraph? #444

Open UlyssesZh opened 6 days ago

UlyssesZh commented 6 days ago

test.yml:

tableEqns: true
eqnBlockTemplate: |
  <table><tr><td>$$t$$</td><td>$$i$$</td></tr></table>

test.md:

test
$$x$$
test
$$y$$ {#eq:test}
test

Run:

pandoc -f markdown -t html5 -M crossrefYaml=test.yml -F pandoc-crossref test.md

Output:

<p>test <span class="math display"><em>x</em></span> test</p>
<div id="eq:test">
<table>
<tr>
<td>
<span class="math display"><em>y</em></span>
</td>
<td>
<span class="math display">(1)</span>
</td>
</tr>
</table>
</div>
<p>test</p>

Expected:

<p>test <span class="math display"><em>x</em></span> test <div id="eq:test">
<table>
<tr>
<td>
<span class="math display"><em>y</em></span>
</td>
<td>
<span class="math display">(1)</span>
</td>
</tr>
</table>
</div> test</p>

Here is why I need this. I use CSS to indent the first line of each paragraph (like `p { text-indent: 2em; }`). I want to control whether the text after a displayed equation starts a new paragraph, so the wrapping `<div>` should not break the paragraph unless I put an empty line in the Markdown.

In the snippet above, the displayed equation without a label already behaves the way I expect. Therefore, this is not a bug in pandoc itself.

paul-kelleher commented 6 days ago

I highly recommend the complex-paragraphs Lua filter here. It works for both LaTeX/PDF and docx output.

UlyssesZh commented 5 days ago

I need HTML. Also, I have many articles with displayed math, so there are far too many "complex paragraphs" to mark each one by hand.

lierdakil commented 5 days ago

This is a limitation of Pandoc's document model. A paragraph can't contain divs. So... yeah, practically speaking, you don't.

Side note, when targeting HTML with MathJax or somesuch, tableEqns isn't necessarily what you want. Using \tag (see here) produces generally better results, typographically speaking. In case you weren't already aware.

UlyssesZh commented 5 days ago

What about <span>?

I use KaTeX, so \tag is not available.

lierdakil commented 5 days ago

The problem is, tables are also block-level elements, so those can't be in paragraphs either, as far as Pandoc is concerned.

That being said, try eqnBlockInlineMath: true. It's a dirty hack, but since you're using inline HTML anyway, you're not particularly concerned about those.

lierdakil commented 5 days ago

Ah, no, sorry, that option does something different.

lierdakil commented 5 days ago

Meh. I don't have a ready solution, and I don't have the bandwidth to implement something at the moment.

UlyssesZh commented 5 days ago

Would it be feasible to add an extra class (such as class="not-closing-paragraph") to the div whenever it belongs to the same paragraph as the block or text after it? I could then use something like `.not-closing-paragraph + p { text-indent: 0; }`.

UlyssesZh commented 5 days ago

> The problem is, tables are also block-level elements, so those can't be in paragraphs either, as far as Pandoc is concerned.

Hmm, but this is not a table. This is inline raw HTML.

lierdakil commented 5 days ago

You could arguably slap together a Lua filter to postprocess pandoc-crossref's output. My brain is toast at the moment, so I won't try to write code, but the idea is to set eqnInlineTemplate: $$e$$☸$$i$$ and then use Lua to replace every equation that contains ☸ (or any other symbol you pick) with your raw HTML, splitting on the symbol. Should be relatively straightforward? I can't recall off the top of my head whether Lua handles Unicode properly, though, so the Unicode symbol I'm proposing might not be your best bet.
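
For readers who want a starting point, here is a minimal sketch of that postprocessing idea, written as a pandoc JSON filter in Python rather than Lua (Python's `str.split` is Unicode-safe, which sidesteps the concern above). It rests on several assumptions that are not confirmed in this thread: that tableEqns is off so eqnInlineTemplate applies, that pandoc-crossref then emits a single display-math node whose TeX source contains the separator, and that nested `<span>` elements are acceptable instead of a `<table>`. The class names `eqn-row`, `eqn-body` and `eqn-number` are made up for illustration.

```python
#!/usr/bin/env python3
"""Pandoc JSON filter sketch: split marked display equations into spans."""
import json
import sys

SEP = "☸"  # assumption: must match the separator used in eqnInlineTemplate


def span(cls, inlines):
    """Build a pandoc Span node with an empty id and a single class."""
    return {"t": "Span", "c": [["", [cls], []], inlines]}


def display_math(tex):
    """Build a pandoc display-math node from TeX source."""
    return {"t": "Math", "c": [{"t": "DisplayMath"}, tex]}


def split_equation(node):
    """Return a replacement Span for a marked display equation, else None."""
    if not (isinstance(node, dict) and node.get("t") == "Math"):
        return None
    math_type, tex = node["c"]
    if math_type.get("t") != "DisplayMath" or SEP not in tex:
        return None
    body, number = tex.split(SEP, 1)
    # Spans are inline elements, so the enclosing paragraph is not broken.
    return span("eqn-row", [
        span("eqn-body", [display_math(body)]),
        span("eqn-number", [display_math(number)]),
    ])


def walk(node):
    """Recursively copy the JSON AST, replacing marked equations."""
    if isinstance(node, list):
        return [walk(item) for item in node]
    if isinstance(node, dict):
        replacement = split_equation(node)
        if replacement is not None:
            return replacement
        return {key: walk(value) for key, value in node.items()}
    return node


def main():
    doc = json.loads(sys.stdin.buffer.read())
    sys.stdout.write(json.dumps(walk(doc)))


# pandoc passes the target format as the first argument to JSON filters,
# so only act as a filter when such an argument is present.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Saved under a hypothetical name such as `split_eqn.py` and made executable, it would run after pandoc-crossref, for example with `-F pandoc-crossref -F ./split_eqn.py`. The resulting spans still need CSS to place the number beside the equation.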

lierdakil commented 5 days ago

> add an additional class

That's doable, but, again, no bandwidth to spare at the moment. I'll accept a PR. You'll want to add a class here:

https://github.com/lierdakil/pandoc-crossref/blob/ad9af798037eb6d67994c22c601c8bdaaddd49c3/lib-internal/Text/Pandoc/CrossRef/References/Blocks/Math.hs#L71-L73 (that span is converted to a div elsewhere because tables are block-level elements)

lierdakil commented 5 days ago

(for additional context, the classes are in the second argument of Span, together with id and key-value attributes)
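
To illustrate that attribute layout without touching Haskell, the sketch below shows the same triple (id, classes, key-value pairs) as it appears in pandoc's JSON serialization, using Python. The helper `add_class` is hypothetical and is not part of pandoc-crossref; the class name is the one proposed earlier in the thread. Note that this only shows where a class lives: a filter running after pandoc-crossref cannot tell whether the equation originally sat in the same paragraph as the following text, which is why the change would have to be made inside pandoc-crossref.

```python
def add_class(node, cls):
    """Return a copy of a Span or Div node with cls added to its classes.

    In pandoc's JSON form, node["c"][0] is the attribute triple
    [identifier, [classes], [[key, value], ...]].
    """
    if node.get("t") not in ("Span", "Div"):
        raise ValueError("expected a Span or Div node")
    ident, classes, keyvals = node["c"][0]
    if cls not in classes:
        classes = classes + [cls]  # new list, the input node is not mutated
    return {"t": node["t"], "c": [[ident, classes, keyvals], node["c"][1]]}


# A Span with id "eq:test", no classes, no key-value pairs, no content.
span = {"t": "Span", "c": [["eq:test", [], []], []]}
print(add_class(span, "not-closing-paragraph")["c"][0])
# prints ['eq:test', ['not-closing-paragraph'], []]
```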

UlyssesZh commented 5 days ago

Thank you for confirming that this is feasible, but I know nothing about Haskell. I will try the additional-filter approach first.

UlyssesZh commented 5 days ago

Errr, <table> simply cannot be nested in <p> (see https://stackoverflow.com/a/9852381). I suppose I could just use <span> for everything instead. The original issue still stands, though.

lierdakil commented 4 days ago

Ah. Right, I keep forgetting <p> isn't just a <div> with default styling. But with HTML specifically, flexbox would yield better results anyway.