rasendubi / uniorg

An accurate Org-mode parser for JavaScript/TypeScript
https://oleksii.shmalko.com/uniorg
GNU General Public License v3.0

Using the `:exports` directive on an export block breaks uniorg-parser #83

Closed ispringle closed 1 year ago

ispringle commented 1 year ago

Setting :exports on an export block breaks uniorg-parser. Given the following block:

#+begin_export html :exports none
<p>Hi</p>
#+end_export

I'd expect uniorg-parser to simply omit this block from the AST, similar to how org-export does. Instead, the parser breaks: it looks like the regex does not account for directives after the backend name, so parsing fails on this line. https://github.com/rasendubi/uniorg/blob/11ef10311b05ebd4d0dea80a7f4c51d4aa82edcf/packages/uniorg-parse/src/parser.ts#L931C13-L931C13

I do see that directives/switches are handled in the src block definition, so I think this is probably just a copy/paste job to get them into export blocks; really, they should probably be handled in all blocks. I'm not sure whether you want uniorg to support the full, complex, and customizable array of keywords you can add to a block header, but I think it should at least parse them.
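To make the suggestion concrete, here is a minimal sketch of a header pattern that tolerates trailing arguments. This is not uniorg's actual regex; `parseExportBlockHeader` and `ExportBlockHeader` are hypothetical names used only for illustration.

```typescript
// Hypothetical helper, NOT part of uniorg: a sketch of an export block
// header matcher that accepts (and keeps) anything after the backend name.
interface ExportBlockHeader {
  backend: string | null; // e.g. "html"; null when no backend is given
  parameters: string | null; // raw text after the backend, e.g. ":exports none"
}

function parseExportBlockHeader(line: string): ExportBlockHeader | null {
  // 1: optional backend name, 2: optional remainder of the line
  const pattern =
    /^[ \t]*#\+begin_export(?:[ \t]+(\S+))?(?:[ \t]+(.*?))?[ \t]*$/i;
  const match = pattern.exec(line);
  if (match === null) {
    return null; // not an export block header at all
  }
  const backend = match[1];
  const parameters = match[2];
  return {
    backend: backend !== undefined ? backend : null,
    // Treat a missing or empty remainder as "no parameters".
    parameters: parameters !== undefined && parameters !== "" ? parameters : null,
  };
}
```

With this shape the parser never needs to fail on the header; what to do with `parameters` can be decided later.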

rasendubi commented 1 year ago

Thanks for the bug report! uniorg should never crash. The expected behavior is that it parses this export block and leaves it up to uniorg-rehype to decide whether to export it or not.
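As a rough illustration of that split (the parser keeps every block, a later step decides), here is a sketch of a tree filter. It is not uniorg-rehype's code: `OrgNode` is a minimal local type, and the `export-block` type name and `backend` field are assumptions about the node shape.

```typescript
// Minimal local node shape. The "export-block" type name and the `backend`
// field are ASSUMED here for illustration; check the real AST before relying
// on them.
interface OrgNode {
  type: string;
  backend?: string | null;
  value?: string;
  children?: OrgNode[];
}

// Hypothetical post-parse step: returns a copy of the tree in which only
// export blocks for the requested backend survive. Other nodes are untouched.
function keepExportBlocksFor(node: OrgNode, backend: string): OrgNode {
  if (!node.children) {
    return node;
  }
  const wanted = backend.toLowerCase();
  const children = node.children
    .filter(
      (child) =>
        child.type !== "export-block" ||
        (child.backend || "").toLowerCase() === wanted
    )
    .map((child) => keepExportBlocksFor(child, backend));
  return { ...node, children };
}
```

The comparison is case-insensitive because the sketch makes no assumption about how the parser normalizes backend names.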

One interesting thing about this example is that :exports none doesn't seem to be documented anywhere (how did you find it?). It also doesn't make much sense because it overrides the previously set html tag (which is documented).

Overall, in Emacs, #+begin_export html :exports none parses and exports the same way as #+begin_export (without a backend tag):

#+begin_export
<p>Hi</p>
#+end_export

This form is supported by uniorg and works correctly, so you can use it as a workaround until I implement the fix.

venikx commented 1 year ago

Sorry to hijack this thread, but I'd like to confirm that I understood this correctly (for source blocks). If you have:

#+begin_source html :exports none 
<p>Hi</p> 
#+end_source

Is it only up to uniorg-rehype to decide whether to include that part? In that case, how do you make this distinction when uniorg outputs the following? (There's no reference to the :exports argument.)

{
  "type": "org-data",
  "contentsBegin": 0,
  "contentsEnd": 58,
  "children": [
    {
      "type": "special-block",
      "affiliated": {},
      "blockType": "source",
      "contentsBegin": 35,
      "contentsEnd": 46,
      "children": [
        {
          "type": "paragraph",
          "affiliated": {},
          "contentsBegin": 35,
          "contentsEnd": 46,
          "children": [
            {
              "type": "text",
              "value": "<p>Hi</p> \n"
            }
          ]
        }
      ]
    }
  ]
}

ispringle commented 1 year ago

> One interesting thing about this example is that :exports none doesn't seem to be documented anywhere (how did you find it?).

Hmm, you seem to be right, actually. I've always used :exports none to disable the block, but it appears this only works because it breaks org-element-export-block-parser.

> It also doesn't make much sense because it overrides the previously set html tag (which is documented).

Well, it beats having to delete a block when you're testing something. I write entire webpages in Org mode, including all the needed scripts, styles, etc. I usually do so in the form of literate programming. Sometimes I have a dozen or more HTML export blocks active at the same time, and if I don't scope all that JS, the blocks overrun each other, so this is pretty useful for debugging that sort of thing.

> Overall, in emacs #+begin_export html :exports none parses and exports the same way as #+begin_export (without backend tag)

This is not the case at all for me, running Emacs 29 and Org 9.7-pre. When I add :exports none (actually, when I add any argument, even :foo bar) and then export to HTML, the contents of the export block do not appear in the output.

For example, without any arguments, this:

Above the export block.

#+begin_export html
<em>In the export block</em>
#+end_export

Below the export block.

yields the following:

<p>
Above the export block.
</p>

<em>In the export block</em>

<p>Below the export block.</p>

When I add any argument to the header such as:

Above the export block.

#+begin_export html :exports none
<em>In the export block</em>
#+end_export

Below the export block.

it yields:

<p>
Above the export block.
</p>

<p>
Below the export block.
</p>

I guess technically this is a bug on Org mode's part, then: the unexpected header argument breaks the exporting process.
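The behaviour observed above can be approximated with a small pre-processing step on the raw Org text. This is only a sketch that mimics the export result shown in the two examples, not what org-element does internally; `dropExportBlocksWithArguments` is a hypothetical name.

```typescript
// Hypothetical pre-processing step: removes every export block whose header
// has anything after the backend name, mimicking the HTML export result
// shown above. Blocks without extra arguments are left untouched.
function dropExportBlocksWithArguments(source: string): string {
  const kept: string[] = [];
  let skipping = false;
  for (const line of source.split("\n")) {
    if (skipping) {
      // Drop lines up to and including the closing #+end_export.
      // Note: an unterminated block would swallow the rest of the input.
      if (/^[ \t]*#\+end_export[ \t]*$/i.test(line)) {
        skipping = false;
      }
      continue;
    }
    // "#+begin_export", a backend name, then at least one more token.
    if (/^[ \t]*#\+begin_export[ \t]+\S+[ \t]+\S/i.test(line)) {
      skipping = true;
      continue;
    }
    kept.push(line);
  }
  return kept.join("\n");
}
```

Running the result through the parser would then sidestep the exception, at the cost of losing the block entirely.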

rasendubi commented 1 year ago

Oh, that's interesting, and I think it explains why Uniorg has this bug: it's a port of the org-element parser. The only difference is that org-element fails to parse the block as an export block, whereas Uniorg throws an exception.

rasendubi commented 1 year ago

Opened #84 to replicate the org-element/org-export behavior.