micromark / micromark-extension-directive

micromark extension to support generic directives (`:cite[smith04]`)
https://unifiedjs.com
MIT License

parser confusion on directive followed by list in a list #3

Closed ChristianMurphy closed 3 years ago

ChristianMurphy commented 3 years ago

Subject of the issue

const micromark = require("micromark/lib");
const directive = require("micromark-extension-directive");

micromark(`:::i
- +
a`, {
  extensions: [directive()],
});

throws

node:assert:385
    throw err;
    ^

AssertionError [ERR_ASSERTION]: expected a previous token
{
  generatedMessage: false,
  code: 'ERR_ASSERTION',
  actual: undefined,
  expected: true,
  operator: '=='
}

Your environment

Steps to reproduce

run:

const micromark = require("micromark/lib");
const directive = require("micromark-extension-directive");

micromark(`:::i
- +
a`, {
  extensions: [directive()],
});

Expected behavior

No error, or a more specific markdown syntax related error

Actual behavior

node:assert:385
    throw err;
    ^

AssertionError [ERR_ASSERTION]: expected a previous token
{
  generatedMessage: false,
  code: 'ERR_ASSERTION',
  actual: undefined,
  expected: true,
  operator: '=='
}

ChristianMurphy commented 3 years ago

Tracing this some more

This does not appear to be caused by the missing `previous` at https://github.com/micromark/micromark-extension-directive/blob/67126633ba8a59c8539006e218f0552c14003c5d/lib/tokenize-directive-container.js#L113-L118: it is getting a value (with the exception of the first iteration, which appears to be expected (?)).

Nor does it appear to come from https://github.com/micromark/micromark-extension-directive/blob/67126633ba8a59c8539006e218f0552c14003c5d/lib/tokenize-directive-text.js#L31, which receives a code as expected.

https://github.com/micromark/micromark/blob/main/lib/tokenize/list.mjs doesn't appear to have a previous token requirement(?)

wooorm commented 3 years ago

Interesting. So, without the container we have a list item that starts with a blank line in a list. That blank line starting the list item means it can’t start lazy continuation, so `a` is not in it, and is thus a sibling to the parent list.

<ul>
<li>
<ul>
<li></li>
</ul>
</li>
</ul>
<p>a</p>

wooorm commented 3 years ago

The “unraveling” of tokens is a bit complex. Because markdown is indent based, some spaces or markers are already parsed by a container (e.g., `*` and on the next line `  ` (two spaces) for a fictional list), and then the rest of the lines (say, `a\n` and `b`) still have to be parsed. And because we have a flat list of tokens, a paragraph then has to start before `a`, and end after `b`, leading to the complex nesting. If a developer (so, me here) makes a mistake, `expected a previous token` is thrown: https://github.com/micromark/micromark/blob/ac44b027357e36694efd2c59babba1b89515e73c/lib/util/subtokenize.mjs#L164
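
To make the flat token list idea concrete, here is a minimal sketch of enter/exit events and a check that every exit closes the most recently entered token. This is an illustration only, not micromark's actual data structures or its `subtokenize` logic; the event shape and the `checkNesting` name are assumptions made for the example.

```javascript
// Illustrative only: a simplified stand-in for a flat list of enter/exit events.
// Here a list item contains one paragraph that spans the lines "a" and "b".
const events = [
  ["enter", "listItem"],
  ["enter", "paragraph"],
  ["exit", "paragraph"],
  ["exit", "listItem"],
];

// Hypothetical checker: each exit must match the most recently entered token.
function checkNesting(events) {
  const stack = [];
  for (const [kind, type] of events) {
    if (kind === "enter") {
      stack.push(type);
    } else {
      const previous = stack.pop();
      if (previous !== type) {
        // Loosely analogous to the assertion seen in this issue.
        throw new Error(`expected a previous "${type}" token, got ${previous}`);
      }
    }
  }
  // Balanced only if nothing is left open.
  return stack.length === 0;
}
```

An exit with no matching enter, such as `checkNesting([["exit", "paragraph"]])`, throws, which is the kind of bookkeeping mistake the comment above describes.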

wooorm commented 3 years ago

This might all be just because nothing in markdown (nor in any other micromark extension) can contain containers. The directive is sort of like fenced code, except that it wraps containers 🤔

ChristianMurphy commented 3 years ago

has something that can contain containers

I was thinking the same thing. Here is the thing that confuses me:

:::a
*
a

parses fine

as does

:::a
* a

and

:::a
* -

even

:::a
* - a
a

it's oddly specific to empty nested containers, with a following paragraph.
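
The cases above can be checked in one pass with a small harness. This is a minimal sketch: `findThrowingInputs` is a hypothetical helper, and `parse` is assumed to be a function wrapping the `micromark(...)` call from the original report.

```javascript
// Hypothetical helper: run `parse` on each input and collect the ones that throw.
function findThrowingInputs(parse, inputs) {
  const failures = [];
  for (const input of inputs) {
    try {
      parse(input);
    } catch (error) {
      failures.push({ input, message: error.message });
    }
  }
  return failures;
}

// Inputs taken from this thread.
const inputs = [
  ":::i\n- +\na",
  ":::a\n*\na",
  ":::a\n* a",
  ":::a\n* -",
  ":::a\n* - a\na",
];
```

It could be called as `findThrowingInputs((doc) => micromark(doc, { extensions: [directive()] }), inputs)`. Going by the reports in this thread, only the first input would be expected to show up in the result.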

wooorm commented 3 years ago

Can also produce it with `:::i\n> >\na` but not `:::i\n> \na`

ChristianMurphy commented 3 years ago

interesting

:::a
> >
>

parses

:::a
> > >
> >

parses as well but,

:::a
> > >
>

fails

2+ layers of empty nesting seem to be the constant
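
The three block quote cases above differ only in how many `>` markers open the first line and how many continue on the second. A small sketch for generating such inputs for further probing; `nestedQuoteInput` is a hypothetical name, not part of micromark or this extension.

```javascript
// Hypothetical helper: build a directive container holding a block quote that
// opens with `openDepth` markers and continues with `nextDepth` markers.
function nestedQuoteInput(openDepth, nextDepth) {
  // Join `depth` copies of ">" with single spaces, e.g. 3 -> "> > >".
  const marker = (depth) => Array(depth).fill(">").join(" ");
  return `:::a\n${marker(openDepth)}\n${marker(nextDepth)}`;
}
```

`nestedQuoteInput(2, 1)` and `nestedQuoteInput(3, 2)` reproduce the two inputs reported above as parsing, and `nestedQuoteInput(3, 1)` reproduces the one reported as failing.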

wooorm commented 3 years ago

wellll... this is weird 🙄

Corvimae commented 3 years ago

I'm seeing a similar issue:

:::card
Hello!

- This
  - Is
  - A
  - List

**Some post list content.**
:::

TypeError: token is undefined
    subcontent subtokenize.js:179
    subtokenize subtokenize.js:60
    postprocess postprocess.js:6
    fromMarkdown index.js:25
    parse index.js:13
    parse index.js:284
    pipelineParse index.js:23
    wrapped wrap.js:25
    next index.js:57
    run index.js:31
    executor index.js:375
    process index.js:370
    processSync index.js:399

If it ends on a first-level list item, it works fine:

:::card
Hello!

- This
  - Is
  - A
  - List
-

**Some post list content.**
:::

wooorm commented 3 years ago

Looking into it again. Re-iterating https://github.com/micromark/micromark-extension-directive/issues/3#issuecomment-748538844: the document content type, which contains containers such as lists and block quotes and which in normal markdown is the top-level content type, is allowed here as a child content type (of the directive). I’m assuming something weird is going on when it’s not the top-level thing.

github-actions[bot] commented 3 years ago

Hi! This was closed. Team: If this was fixed, please add phase/solved. Otherwise, please add one of the no/* labels.