rasendubi / uniorg

An accurate Org-mode parser for JavaScript/TypeScript
https://oleksii.shmalko.com/uniorg
GNU General Public License v3.0

Items in a list should not be wrapped into a paragraph #105

Open venikx opened 4 months ago

venikx commented 4 months ago

The following snippet:

- item1
- item2

is parsed as:

{
  "type": "org-data",
  "contentsBegin": 0,
  "contentsEnd": 15,
  "children": [
    {
      "type": "plain-list",
      "affiliated": {},
      "indent": 0,
      "listType": "unordered",
      "contentsBegin": 0,
      "contentsEnd": 15,
      "children": [
        {
          "type": "list-item",
          "indent": 0,
          "bullet": "- ",
          "counter": null,
          "checkbox": null,
          "contentsBegin": 2,
          "contentsEnd": 8,
          "children": [
            {
              "type": "paragraph",
              "affiliated": {},
              "contentsBegin": 2,
              "contentsEnd": 8,
              "children": [
                {
                  "type": "text",
                  "value": "item1\n"
                }
              ]
            }
          ]
        },
        {
          "type": "list-item",
          "indent": 0,
          "bullet": "- ",
          "counter": null,
          "checkbox": null,
          "contentsBegin": 10,
          "contentsEnd": 15,
          "children": [
            {
              "type": "paragraph",
              "affiliated": {},
              "contentsBegin": 10,
              "contentsEnd": 15,
              "children": [
                {
                  "type": "text",
                  "value": "item2"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

It is then converted to:

<ul>
  <li>
    <p>item1</p>
  </li>
  <li>
    <p>item2</p>
  </li>
</ul>

From what I can tell, ox-html does not add an extra paragraph. I have not verified whether org-element wraps the contents of a list-item in a paragraph. If it does not, this should be fixed in uniorg-parse. If org-element wraps it and ox-html does not, it should be handled in uniorg-rehype.

rasendubi commented 3 months ago

org-element parses these examples as paragraphs, so this should be handled in uniorg-rehype.

It's a bit more complicated than simply removing <p> tags, though: the tags are only removed if the item contains a single paragraph, possibly followed by a sublist.

Example:

- item1
- item2
- a longer item
  spanning

  multiple lines has a p tag
- a paragraph followed by a sublist
  - does not have a p tag

...produces:

<ul>
  <li>item1</li>
  <li>item2</li>
  <li>
    <p>a longer item spanning</p>

    <p>multiple lines has a p tag</p>
  </li>
  <li>
    a paragraph followed by a sublist
    <ul>
      <li>does not have a p tag</li>
    </ul>
  </li>
</ul>
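
The rule above could be sketched as a small predicate over a list-item's children. This is only an illustration under assumptions: `OrgNode` is a simplified, hypothetical node shape (the real uniorg AST carries more fields), and `shouldUnwrapParagraph` is a made-up helper name, not an existing uniorg-rehype function.

```typescript
// Simplified, hypothetical node shape for illustration only.
interface OrgNode {
  type: string;
  children?: OrgNode[];
}

// True when the item's paragraph should be rendered without <p>:
// the item holds exactly one paragraph, optionally followed by one sublist.
function shouldUnwrapParagraph(item: OrgNode): boolean {
  const [first, second, ...rest] = item.children ?? [];
  // The first child must be a paragraph, and at most one child may follow it.
  if (first?.type !== "paragraph" || rest.length > 0) return false;
  // The paragraph is either alone or followed by a sublist.
  return second === undefined || second.type === "plain-list";
}

// Items mirroring the example above.
const simpleItem: OrgNode = {
  type: "list-item",
  children: [{ type: "paragraph" }],
};
const twoParagraphItem: OrgNode = {
  type: "list-item",
  children: [{ type: "paragraph" }, { type: "paragraph" }],
};
const itemWithSublist: OrgNode = {
  type: "list-item",
  children: [{ type: "paragraph" }, { type: "plain-list" }],
};
```

Under this sketch, `simpleItem` and `itemWithSublist` would have their paragraph unwrapped, while `twoParagraphItem` keeps both `<p>` tags.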