micromark / micromark-extension-directive

micromark extension to support generic directives (`:cite[smith04]`)

Crash on line endings in nested labels #13

ChristianMurphy closed 2 years ago

ChristianMurphy commented 3 years ago

Affected packages and versions

2.0.0

Link to runnable example

https://stackblitz.com/edit/node-wh1r9s?file=index.js

Steps to reproduce

Run the following script:

import { micromark } from "micromark";
import { directive, directiveHtml } from "micromark-extension-directive";

// Text directives whose labels are nested and span line endings.
const content = `::name
:name[
text:name[
]:name]:name:name`;

console.log(
  micromark(content, {
    extensions: [directive()],
    htmlExtensions: [directiveHtml],
  })
);

Expected behavior

The HTML is printed within a few seconds.

Actual behavior

After 5 minutes of processing (and counting) there is still no result. It may eventually return a document, but it hasn't yet.

Runtime

Node v16

Package manager

npm v7

OS

Linux

Build and bundle tools

No response

ChristianMurphy commented 3 years ago

Reducing the issue a bit further:

:name[
:name[
]
]

This may be related to nesting directives rather than links.

ChristianMurphy commented 3 years ago

With debugging and the development condition turned on:

$ DEBUG="*" node --conditions development test.mjs
  micromark main: passing `58` to start +0ms
  micromark position: restore: `{"line":1,"column":1,"offset":0,"_index":0,"_bufferIndex":0}` +1ms
  micromark main: passing `58` to flowStart +0ms
  micromark enter: `chunkFlow` +1ms
  micromark consume: `58` +0ms
  micromark main: passing `110` to flowContinue +0ms
  micromark consume: `110` +0ms
  micromark main: passing `97` to flowContinue +0ms
  micromark consume: `97` +0ms
  micromark main: passing `109` to flowContinue +0ms
  micromark consume: `109` +0ms
  micromark main: passing `101` to flowContinue +0ms
  micromark consume: `101` +0ms
  micromark main: passing `91` to flowContinue +0ms
  micromark consume: `91` +0ms
  micromark main: passing `-4` to flowContinue +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":2,"column":1,"offset":7,"_index":1,"_bufferIndex":-1}` +1ms
  micromark exit: `chunkFlow` +0ms
  micromark position: define skip: `{"line":1,"column":1,"offset":0,"_index":0,"_bufferIndex":-1}` +0ms
  micromark main: passing `58` to start +0ms
  micromark position: restore: `{"line":1,"column":1,"offset":0,"_index":0,"_bufferIndex":0}` +0ms
  micromark main: passing `58` to start +0ms
  micromark enter: `directiveContainer` +0ms
  micromark enter: `directiveContainerFence` +0ms
  micromark enter: `directiveContainerSequence` +0ms
  micromark consume: `58` +0ms
  micromark main: passing `110` to sequenceOpen +0ms
  micromark position: restore: `{"line":1,"column":1,"offset":0,"_index":0,"_bufferIndex":0}` +1ms
  micromark main: passing `58` to start +0ms
  micromark enter: `directiveLeaf` +0ms
  micromark enter: `directiveLeafSequence` +0ms
  micromark consume: `58` +0ms
  micromark main: passing `110` to inStart +0ms
  micromark position: restore: `{"line":1,"column":1,"offset":0,"_index":0,"_bufferIndex":0}` +0ms
  micromark main: passing `58` to start +0ms
  micromark enter: `content` +0ms
  micromark enter: `chunkContent` +0ms
  micromark consume: `58` +0ms
  micromark main: passing `110` to data +0ms
  micromark consume: `110` +0ms
  micromark main: passing `97` to data +0ms
  micromark consume: `97` +0ms
  micromark main: passing `109` to data +0ms
  micromark consume: `109` +0ms
  micromark main: passing `101` to data +0ms
  micromark consume: `101` +0ms
  micromark main: passing `91` to data +0ms
  micromark consume: `91` +0ms
  micromark main: passing `-4` to data +0ms
  micromark exit: `chunkContent` +0ms
  micromark enter: `lineEnding` +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":2,"column":1,"offset":7,"_index":1,"_bufferIndex":-1}` +0ms
  micromark exit: `lineEnding` +0ms
  micromark main: passing `58` to start +0ms
  micromark position: restore: `{"line":2,"column":1,"offset":7,"_index":2,"_bufferIndex":0}` +1ms
  micromark main: passing `58` to thereIsNoNewContainer +0ms
  micromark enter: `chunkFlow` +0ms
  micromark consume: `58` +0ms
  micromark main: passing `110` to flowContinue +0ms
  micromark consume: `110` +0ms
  micromark main: passing `97` to flowContinue +0ms
  micromark consume: `97` +0ms
  micromark main: passing `109` to flowContinue +0ms
  micromark consume: `109` +0ms
  micromark main: passing `101` to flowContinue +0ms
  micromark consume: `101` +0ms
  micromark main: passing `91` to flowContinue +0ms
  micromark consume: `91` +0ms
  micromark main: passing `-4` to flowContinue +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":3,"column":1,"offset":14,"_index":3,"_bufferIndex":-1}` +0ms
  micromark exit: `chunkFlow` +0ms
  micromark position: define skip: `{"line":2,"column":1,"offset":7,"_index":2,"_bufferIndex":-1}` +0ms
  micromark main: passing `58` to start +0ms
  micromark enter: `directiveContainer` +0ms
  micromark enter: `directiveContainerFence` +0ms
  micromark enter: `directiveContainerSequence` +0ms
  micromark consume: `58` +0ms
  micromark main: passing `110` to sequenceOpen +0ms
  micromark position: restore: `{"line":2,"column":1,"offset":7,"_index":2,"_bufferIndex":0}` +0ms
  micromark main: passing `58` to start +0ms
  micromark enter: `directiveLeaf` +0ms
  micromark enter: `directiveLeafSequence` +0ms
  micromark consume: `58` +0ms
  micromark main: passing `110` to inStart +0ms
  micromark position: restore: `{"line":2,"column":1,"offset":7,"_index":2,"_bufferIndex":0}` +0ms
  micromark main: passing `58` to ok +0ms
  micromark position: restore: `{"line":1,"column":7,"offset":6,"_index":1,"_bufferIndex":-1}` +1ms
  micromark main: passing `-4` to contentContinue +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":2,"column":1,"offset":7,"_index":1,"_bufferIndex":-1}` +0ms
  micromark exit: `chunkContent` +0ms
  micromark enter: `chunkContent` +0ms
  micromark main: passing `58` to data +0ms
  micromark consume: `58` +0ms
  micromark main: passing `110` to data +0ms
  micromark consume: `110` +0ms
  micromark main: passing `97` to data +0ms
  micromark consume: `97` +0ms
  micromark main: passing `109` to data +0ms
  micromark consume: `109` +0ms
  micromark main: passing `101` to data +0ms
  micromark consume: `101` +0ms
  micromark main: passing `91` to data +0ms
  micromark consume: `91` +0ms
  micromark main: passing `-4` to data +0ms
  micromark exit: `chunkContent` +0ms
  micromark enter: `lineEnding` +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":3,"column":1,"offset":14,"_index":3,"_bufferIndex":-1}` +0ms
  micromark exit: `lineEnding` +0ms
  micromark main: passing `93` to start +0ms
  micromark position: restore: `{"line":3,"column":1,"offset":14,"_index":4,"_bufferIndex":0}` +0ms
  micromark main: passing `93` to thereIsNoNewContainer +0ms
  micromark enter: `chunkFlow` +0ms
  micromark consume: `93` +0ms
  micromark main: passing `-4` to flowContinue +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":4,"column":1,"offset":16,"_index":5,"_bufferIndex":-1}` +0ms
  micromark exit: `chunkFlow` +0ms
  micromark position: define skip: `{"line":3,"column":1,"offset":14,"_index":4,"_bufferIndex":-1}` +0ms
  micromark main: passing `93` to start +0ms
  micromark position: restore: `{"line":2,"column":7,"offset":13,"_index":3,"_bufferIndex":-1}` +0ms
  micromark main: passing `-4` to contentContinue +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":3,"column":1,"offset":14,"_index":3,"_bufferIndex":-1}` +1ms
  micromark exit: `chunkContent` +0ms
  micromark enter: `chunkContent` +0ms
  micromark main: passing `93` to data +0ms
  micromark consume: `93` +0ms
  micromark main: passing `-4` to data +0ms
  micromark exit: `chunkContent` +0ms
  micromark enter: `lineEnding` +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":4,"column":1,"offset":16,"_index":5,"_bufferIndex":-1}` +0ms
  micromark exit: `lineEnding` +0ms
  micromark main: passing `93` to start +0ms
  micromark position: restore: `{"line":4,"column":1,"offset":16,"_index":6,"_bufferIndex":0}` +0ms
  micromark main: passing `93` to thereIsNoNewContainer +0ms
  micromark enter: `chunkFlow` +0ms
  micromark consume: `93` +0ms
  micromark main: passing `null` to flowContinue +0ms
  micromark exit: `chunkFlow` +0ms
  micromark position: define skip: `{"line":4,"column":1,"offset":16,"_index":6,"_bufferIndex":-1}` +0ms
  micromark main: passing `93` to start +0ms
  micromark position: restore: `{"line":3,"column":2,"offset":15,"_index":5,"_bufferIndex":-1}` +0ms
  micromark main: passing `-4` to contentContinue +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":4,"column":1,"offset":16,"_index":5,"_bufferIndex":-1}` +0ms
  micromark exit: `chunkContent` +0ms
  micromark enter: `chunkContent` +0ms
  micromark main: passing `93` to data +0ms
  micromark consume: `93` +0ms
  micromark main: passing `null` to data +0ms
  micromark exit: `chunkContent` +0ms
  micromark exit: `content` +0ms
  micromark main: passing `58` to start +1ms
  micromark enter: `paragraph` +0ms
  micromark enter: `chunkText` +0ms
  micromark consume: `58` +0ms
  micromark main: passing `110` to data +0ms
  micromark consume: `110` +0ms
  micromark main: passing `97` to data +0ms
  micromark consume: `97` +0ms
  micromark main: passing `109` to data +0ms
  micromark consume: `109` +0ms
  micromark main: passing `101` to data +0ms
  micromark consume: `101` +0ms
  micromark main: passing `91` to data +0ms
  micromark consume: `91` +0ms
  micromark main: passing `-4` to data +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":2,"column":1,"offset":7,"_index":1,"_bufferIndex":-1}` +0ms
  micromark exit: `chunkText` +0ms
  micromark position: define skip: `{"line":2,"column":1,"offset":7,"_index":2,"_bufferIndex":-1}` +0ms
  micromark main: passing `58` to lineStart +0ms
  micromark enter: `chunkText` +0ms
  micromark consume: `58` +0ms
  micromark main: passing `110` to data +0ms
  micromark consume: `110` +0ms
  micromark main: passing `97` to data +0ms
  micromark consume: `97` +0ms
  micromark main: passing `109` to data +0ms
  micromark consume: `109` +0ms
  micromark main: passing `101` to data +0ms
  micromark consume: `101` +1ms
  micromark main: passing `91` to data +0ms
  micromark consume: `91` +0ms
  micromark main: passing `-4` to data +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":3,"column":1,"offset":14,"_index":3,"_bufferIndex":-1}` +0ms
  micromark exit: `chunkText` +0ms
  micromark position: define skip: `{"line":3,"column":1,"offset":14,"_index":4,"_bufferIndex":-1}` +0ms
  micromark main: passing `93` to lineStart +0ms
  micromark enter: `chunkText` +1ms
  micromark consume: `93` +0ms
  micromark main: passing `-4` to data +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":4,"column":1,"offset":16,"_index":5,"_bufferIndex":-1}` +0ms
  micromark exit: `chunkText` +0ms
  micromark position: define skip: `{"line":4,"column":1,"offset":16,"_index":6,"_bufferIndex":-1}` +0ms
  micromark main: passing `93` to lineStart +0ms
  micromark enter: `chunkText` +0ms
  micromark consume: `93` +0ms
  micromark main: passing `null` to data +0ms
  micromark exit: `chunkText` +0ms
  micromark exit: `paragraph` +0ms
  micromark consume: `null` +0ms
  micromark main: passing `null` to afterConstruct +0ms
  micromark consume: `null` +0ms
  micromark consume: `null` +0ms
  micromark main: passing `58` to start +0ms
  micromark enter: `directiveText` +0ms
  micromark enter: `directiveTextMarker` +0ms
  micromark consume: `58` +0ms
  micromark exit: `directiveTextMarker` +0ms
  micromark main: passing `110` to start +1ms
  micromark enter: `directiveTextName` +0ms
  micromark consume: `110` +0ms
  micromark main: passing `97` to name +0ms
  micromark consume: `97` +0ms
  micromark main: passing `109` to name +0ms
  micromark consume: `109` +0ms
  micromark main: passing `101` to name +0ms
  micromark consume: `101` +0ms
  micromark main: passing `91` to name +0ms
  micromark exit: `directiveTextName` +0ms
  micromark enter: `directiveTextLabel` +0ms
  micromark enter: `directiveTextLabelMarker` +0ms
  micromark consume: `91` +0ms
  micromark exit: `directiveTextLabelMarker` +0ms
  micromark main: passing `-4` to afterStart +0ms
  micromark enter: `directiveTextLabelString` +0ms
  micromark enter: `lineEnding` +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":2,"column":1,"offset":7,"_index":1,"_bufferIndex":-1}` +0ms
  micromark exit: `lineEnding` +0ms
  micromark position: define skip: `{"line":2,"column":1,"offset":7,"_index":2,"_bufferIndex":-1}` +0ms
  micromark main: passing `58` to atBreak +0ms
  micromark enter: `chunkText` +0ms
  micromark consume: `58` +1ms
  micromark main: passing `110` to label +0ms
  micromark consume: `110` +0ms
  micromark main: passing `97` to label +0ms
  micromark consume: `97` +0ms
  micromark main: passing `109` to label +0ms
  micromark consume: `109` +0ms
  micromark main: passing `101` to label +0ms
  micromark consume: `101` +0ms
  micromark main: passing `91` to label +0ms
  micromark consume: `91` +0ms
  micromark main: passing `-4` to label +0ms
  micromark exit: `chunkText` +0ms
  micromark enter: `lineEnding` +0ms
  micromark consume: `-4` +0ms
  micromark position: after eol: `{"line":3,"column":1,"offset":14,"_index":3,"_bufferIndex":-1}` +0ms
  micromark exit: `lineEnding` +0ms
  micromark position: define skip: `{"line":3,"column":1,"offset":14,"_index":4,"_bufferIndex":-1}` +0ms
  micromark main: passing `93` to atBreak +0ms
  micromark enter: `chunkText` +0ms
node:internal/process/esm_loader:74
    internalBinding('errors').triggerUncaughtException(
                              ^

AssertionError [ERR_ASSERTION]: expected non-empty token (`chunkText`)
    at Object.exit (./node_modules/micromark/dev/lib/create-tokenizer.js:311:5)
    at label (.-fuzz/node_modules/micromark-extension-directive/dev/lib/factory-label.js:109:15)
    at atBreak (./node_modules/micromark-extension-directive/dev/lib/factory-label.js:87:12)
    at go (./node_modules/micromark/dev/lib/create-tokenizer.js:218:13)
    at main (./node_modules/micromark/dev/lib/create-tokenizer.js:198:11)
    at Object.write (./node_modules/micromark/dev/lib/create-tokenizer.js:124:5)
    at subcontent (./node_modules/micromark-util-subtokenize/dev/index.js:190:17)
    at subtokenize (./node_modules/micromark-util-subtokenize/dev/index.js:82:30)
    at postprocess (./node_modules/micromark/dev/lib/postprocess.js:12:11)
    at micromark (./node_modules/micromark/dev/index.js:37:9) {
  generatedMessage: false,
  code: 'ERR_ASSERTION',
  actual: false,
  expected: true,
  operator: '=='
}
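
The numbers in the log above are character codes. A minimal sketch for reading them, assuming micromark's convention of negative "virtual" codes for whitespace: the log itself shows `-4` followed by `after eol`, so it is a line ending; the other negative values in the table are from memory of `micromark-util-symbol` and should be treated as assumptions. `describeCode` is a hypothetical helper, not part of micromark.

```javascript
// Hypothetical helper for reading the debug log; not part of micromark.
// Negative values are virtual codes (only -4 is confirmed by the log above,
// the rest are assumed from micromark-util-symbol).
const virtualCodes = new Map([
  [-5, '<CR>'],
  [-4, '<LF>'],
  [-3, '<CRLF>'],
  [-2, '<TAB>'],
  [-1, '<VIRTUAL SPACE>']
])

function describeCode(code) {
  // `null` is passed to the tokenizer at the end of the input.
  if (code === null) return '<EOF>'
  if (virtualCodes.has(code)) return virtualCodes.get(code)
  // Everything else is a regular character code.
  return String.fromCharCode(code)
}

// Codes passed to `flowContinue` for the first line of the reduced input.
const firstLine = [58, 110, 97, 109, 101, 91, -4]
console.log(firstLine.map(describeCode).join('')) // prints ":name[<LF>"

// The code being handled when the assertion fired.
console.log(describeCode(93)) // prints "]"
```

Read this way, the trace ends with `93` (`]`) being passed to `atBreak` right after a line ending, which is where the empty `chunkText` assertion is raised.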

chartinger commented 2 years ago

There seems to be a problem with nested "empty" directives:

  t.equal(
    micromark(':a[\n]', options({'*': h})),
    '<p><a>\n</a></p>',
    'should not choke on a linebreak'
  )

✅ passes, but

  t.equal(
    micromark(':a[:b[\n]]', options({'*': h})),
    '<p><a><b>\n</b></a></p>',
    'should not choke on nested directive with linebreak'
  )

❌ fails with the above error


Going further into possible inputs, I am not sure what should even happen in these cases:

  t.equal(
    micromark(':a[\\]', options({'*': h})),
    '<p>???</p>',
    'what should happen here?'
  )

❓ results in <p><a></a>[]</p> and

  t.equal(
    micromark(':a[:b[\\\n]]', options({'*': h})),
    '???',
    'what should happen here?'
  )

❌ results in the parse error above


Maybe not directly related, but nesting seems to have a limit too:

  t.equal(
    micromark(':a[:b[:c[:d[Test]{x="2"}]]]{y="3"}', options({'*': h})),
    '<p><a y="3"><b><c><d x="2">Test</d></c></b></a></p>',
    'should support nesting level greater than 3'
  )

✅ works fine, but

  t.equal(
    micromark(':a[:b[:c[:d[:e[Test]{x="2"}]]]]{y="3"}', options({'*': h})),
    '<p><a y="3"><b><c><d><e x="2">Test</e></d></c></b></a></p>',
    'should support nesting level greater than 4'
  )

❌ fails with:

    operator: equal
    expected: |-
      '<p><a y="3"><b><c><d><e x="2">Test</e></d></c></b></a></p>'
    actual: |-
      '<p><a></a>[<b><c><d><e x="2">Test</e></d></c></b>]{y=&quot;3&quot;}</p>'

wooorm commented 2 years ago

Thanks for the repros @chartinger.

  1. ':a[\n]': works as expected.
  2. ':a[:b[\n]]': 🐞 yep, that’s a bug!
  3. ':a[\\]': This is expected. The JavaScript escape in the string produces a single backslash in markdown: :a[\], which contains a markdown-escaped ]. As there is no further ], it can’t be a label. It should yield <p><a></a>[]</p>, which it does, so it works as expected.
  4. ':a[:b[\\\n]]': The first \\ is an escaped backslash in JavaScript, so markdown sees :a[:b[\␤]] (where ␤ represents a line feed). That would be a hard break (a backslash before a line ending in markdown, <br> in HTML). Will investigate 🔎
  5. ':a[:b[:c[:d[Test]{x="2"}]]]{y="3"}' and 6. (the variant nested one level deeper): Unrelated indeed, but the good news is that this limit was just changed. It is a security measure, and it was raised from 3 to 32: https://github.com/micromark/micromark/commit/5194939a2f2de3bbae06ec225e9a0c4272852a86.
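
The escaping in items 3 and 4 can be checked without the parser. A minimal sketch in plain JavaScript (the variable names are only for illustration) showing which characters actually reach markdown:

```javascript
// Item 3: the JavaScript literal ':a[\\]' holds a single backslash.
const third = ':a[\\]'
console.log(third) // prints :a[\]
console.log(third.length) // prints 5

// Item 4: '\\' is one backslash, '\n' is one line feed.
const fourth = ':a[:b[\\\n]]'
console.log(fourth.length) // prints 10

// Splitting on the line feed shows the backslash sits at the end of line 1,
// directly before the line ending.
const lines = fourth.split('\n')
console.log(lines.length) // prints 2
console.log(lines[0].endsWith('\\')) // prints true
console.log(lines[1]) // prints ]]
```

In other words, markdown never sees two backslashes in either input: it sees one backslash followed by ] in item 3, and one backslash followed by a line ending in item 4.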

wooorm commented 2 years ago

Thanks all! Released!