executablebooks / MyST-Parser

An extended commonmark compliant parser, with bridges to docutils/sphinx
https://myst-parser.readthedocs.io
MIT License

Tables contain additional `<p>` paragraph tags #534

Open thibaudcolas opened 2 years ago

thibaudcolas commented 2 years ago

Describe the bug

context

This is the same issue as #533, but for tables. I would like to use MyST’s implementation of GFM tables:

| foo | bar |
| --- | --- |
| baz | bim |

MyST converts this to:

<table class="colwidths-auto table">
<thead>
<tr class="row-odd"><th class="head"><p>foo</p></th>
<th class="head"><p>bar</p></th>
</tr>
</thead>
<tbody>
<tr class="row-even"><td><p>baz</p></td>
<td><p>bim</p></td>
</tr>
</tbody>
</table>

This example comes from the getting started guide.

expectation

I would have expected:

<table>
<thead>
<tr>
<th>foo</th>
<th>bar</th>
</tr>
</thead>
<tbody>
<tr>
<td>baz</td>
<td>bim</td>
</tr>
</tbody>
</table>

This is the output of markdown-it-py, and the markdown-it live demo.

bug

Instead, the <p> tags are added, which means extra vertical spacing and odd content semantics.

problem

I can’t think of a scenario where the extra vertical space, or the semantics, are desirable.

Reproduce the bug

  1. Create a Markdown document with a table
  2. Convert to HTML
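
To make the check in step 2 repeatable, here is a minimal sketch using only the Python standard library. It counts `<p>` elements that appear inside `<td>`/`<th>` cells of an HTML string. The names `CellParagraphCounter` and `count_cell_paragraphs` are hypothetical helpers written for this illustration and are not part of MyST-Parser. The two HTML strings are the outputs quoted above, joined onto fewer lines.

```python
from html.parser import HTMLParser


class CellParagraphCounter(HTMLParser):
    """Count <p> start tags that occur inside <td> or <th> cells."""

    def __init__(self):
        super().__init__()
        self.cell_depth = 0  # > 0 while the parser is inside a table cell
        self.paragraphs = 0  # number of <p> tags seen inside cells

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self.cell_depth += 1
        elif tag == "p" and self.cell_depth:
            self.paragraphs += 1

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.cell_depth:
            self.cell_depth -= 1


def count_cell_paragraphs(html: str) -> int:
    """Hypothetical helper: return how many <p> tags sit inside table cells."""
    parser = CellParagraphCounter()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


# Output reported for MyST in this issue.
myst_html = (
    '<table class="colwidths-auto table"><thead>'
    '<tr class="row-odd"><th class="head"><p>foo</p></th>'
    '<th class="head"><p>bar</p></th></tr></thead>'
    '<tbody><tr class="row-even"><td><p>baz</p></td>'
    "<td><p>bim</p></td></tr></tbody></table>"
)
# Output expected in this issue (what markdown-it-py produces).
expected_html = (
    "<table><thead><tr><th>foo</th><th>bar</th></tr></thead>"
    "<tbody><tr><td>baz</td><td>bim</td></tr></tbody></table>"
)

print(count_cell_paragraphs(myst_html))      # 4
print(count_cell_paragraphs(expected_html))  # 0
```

A non-zero count on the rendered page confirms the behaviour described in this report.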

List your environment

myst-parser==0.17.0

chrisjsewell commented 2 years ago

As in my response to #533, the output here is compliant with sphinx/docutils, but not quite with CommonMark:

===  ===
foo  bar
===  ===
baz  bim
===  ===

generates

$ rst2pseudoxml.py test.rst
<document source="test.rst">
    <table>
        <tgroup cols="2">
            <colspec colwidth="3">
            <colspec colwidth="3">
            <thead>
                <row>
                    <entry>
                        <paragraph>
                            foo
                    <entry>
                        <paragraph>
                            bar
            <tbody>
                <row>
                    <entry>
                        <paragraph>
                            baz
                    <entry>
                        <paragraph>
                            bim

and

| foo | bar |
| --- | --- |
| baz | bim |

generates

$ myst-docutils-pseudoxml test.rst
<document source="test.rst">
    <table classes="colwidths-auto">
        <tgroup cols="2">
            <colspec colwidth="50.0">
            <colspec colwidth="50.0">
            <thead>
                <row>
                    <entry>
                        <paragraph>
                            foo
                    <entry>
                        <paragraph>
                            bar
            <tbody>
                <row>
                    <entry>
                        <paragraph>
                            baz
                    <entry>
                        <paragraph>
                            bim

Trying to remove the paragraph nodes will likely cause issues with the docutils/sphinx build, which expects them.

chrisjsewell commented 2 years ago

Again, one could add hidden=True to the paragraph nodes, but the open question is then how to have this respected by docutils/sphinx.
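
Since the doctree needs the paragraph nodes, one possible workaround is to leave the doctree alone and post-process the rendered HTML instead. The following is only a rough sketch of that idea, not something proposed in this thread or provided by MyST-Parser. `unwrap_cell_paragraphs` is a hypothetical helper, and where to hook it into a docutils/sphinx build is left open. It is regex-based and therefore brittle. It only unwraps cells whose entire content is a single `<p>` without attributes, and leaves everything else untouched.

```python
import re

# A <td>/<th> cell whose whole content is exactly one attribute-less <p>...</p>.
# The tempered body pattern refuses to cross another <p> or </p>, so cells
# holding several paragraphs never match and are left as they are.
_CELL_WITH_SINGLE_P = re.compile(
    r"<(?P<tag>td|th)\b(?P<attrs>[^>]*)>\s*"
    r"<p>(?P<body>(?:(?!</?p[\s>]).)*?)</p>"
    r"\s*</(?P=tag)>",
    re.DOTALL,
)


def unwrap_cell_paragraphs(html: str) -> str:
    """Hypothetical helper: drop the sole <p> wrapper inside table cells."""
    return _CELL_WITH_SINGLE_P.sub(
        r"<\g<tag>\g<attrs>>\g<body></\g<tag>>", html
    )


row = '<tr class="row-odd"><th class="head"><p>foo</p></th>\n<td><p>baz</p></td>\n</tr>'
print(unwrap_cell_paragraphs(row))
```

Cell attributes such as `class="head"` are preserved, so existing theme styling should continue to apply. This addresses only the HTML output; the pseudoxml shown above would still contain the paragraph nodes.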