mity / md4c

C Markdown parser. Fast. SAX-like interface. Compliant to CommonMark specification.
MIT License

Lists containing headers have incorrect marks / mark delimiters #154

Closed by rundel 3 years ago

rundel commented 3 years ago

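Similar to, and possibly related to, #153, but this seems to affect both unordered and ordered lists. When a list item starts with a header, the returned mark (unordered lists) or mark_delimiter (ordered lists) appears to be \001, \002, etc., corresponding to the header level rather than the actual list marker character.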

See examples below:

> parse_md("- # Foo", character())
md_block_doc [flags:]
└── md_block_ul [tight: 1, mark: '\001']
    └── md_block_li
        └── md_block_h [level: 1]
            └── md_text_normal - "Foo"

> parse_md("- ## Foo", character())
md_block_doc [flags:]
└── md_block_ul [tight: 1, mark: '\002']
    └── md_block_li
        └── md_block_h [level: 2]
            └── md_text_normal - "Foo"

> parse_md("- ### Foo", character())
md_block_doc [flags:]
└── md_block_ul [tight: 1, mark: '\003']
    └── md_block_li
        └── md_block_h [level: 3]
            └── md_text_normal - "Foo"

> parse_md("- *Foo*", character())
md_block_doc [flags:]
└── md_block_ul [tight: 1, mark: '-']
    └── md_block_li
        └── md_span_em
            └── md_text_normal - "Foo"

> parse_md("2. ### Foo", character())
md_block_doc [flags:]
└── md_block_ol [start: 2, tight: 1, mark_delimiter: '\003']
    └── md_block_li
        └── md_block_h [level: 3]
            └── md_text_normal - "Foo"

> parse_md("10. ### Foo", character())
md_block_doc [flags:]
└── md_block_ol [start: 10, tight: 1, mark_delimiter: '\003']
    └── md_block_li
        └── md_block_h [level: 3]
            └── md_text_normal - "Foo"

> parse_md("1. Foo\n\n2. # Hello", character())
md_block_doc [flags:]
└── md_block_ol [start: 1, tight: 0, mark_delimiter: '.']
    ├── md_block_li
    │   └── md_block_p
    │       └── md_text_normal - "Foo"
    └── md_block_li
        └── md_block_h [level: 1]
            └── md_text_normal - "Hello"

rundel commented 3 years ago

The same thing also seems to occur with code blocks inside list items; here the mark_delimiter comes back as \001:

> parse_md("1. ```\n   foo\n   ```\n", character())
md_block_doc [flags:]
└── md_block_ol [start: 1, tight: 1, mark_delimiter: '\001']
    └── md_block_li
        └── md_block_code [info: '', lang: '', fence_char: '`']
            ├── md_text_code - "foo"
            └── md_text_code - "\n"

mity commented 3 years ago

Good catch. Should be fixed now.
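
For reference, the values one would expect in the reports above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `expected_list_mark` is a hypothetical helper, not part of md4c and not the fix that was applied. It looks only at the start of a single line and ignores cases such as thematic breaks (`- - -`) that a real parser has to rule out first.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (NOT part of md4c): returns the character a parser
 * would be expected to report as the list mark for a line that opens a
 * list item. For bullet lists that is '-', '+' or '*'; for ordered lists
 * it is the delimiter '.' or ')'. Returns '\0' if the line does not start
 * with a list marker. Thematic breaks and lazy continuation are ignored. */
static char expected_list_mark(const char *line)
{
    size_t i = 0;

    /* Up to three leading spaces may precede the marker. */
    while (i < 3 && line[i] == ' ')
        i++;

    /* Bullet marker, followed by whitespace or end of line. */
    if (line[i] == '-' || line[i] == '+' || line[i] == '*') {
        char next = line[i + 1];
        if (next == ' ' || next == '\t' || next == '\n' || next == '\0')
            return line[i];
        return '\0';
    }

    /* Ordered marker: 1 to 9 digits, then '.' or ')'. */
    size_t digits = 0;
    while (digits < 9 && isdigit((unsigned char)line[i + digits]))
        digits++;
    if (digits == 0)
        return '\0';

    char delim = line[i + digits];
    if (delim != '.' && delim != ')')
        return '\0';

    /* The delimiter must also be followed by whitespace or end of line. */
    char next = line[i + digits + 1];
    if (next == ' ' || next == '\t' || next == '\n' || next == '\0')
        return delim;
    return '\0';
}
```

With the inputs from this issue, `expected_list_mark("- ## Foo")` yields `'-'` and `expected_list_mark("10. ### Foo")` yields `'.'`, regardless of what block follows the marker.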