fletcher / MultiMarkdown-6

Lightweight markup processor to produce HTML, LaTeX, and more.
https://fletcher.github.io/MultiMarkdown-6/

[Question] After stripping of MARKER_BLOCKQUOTE in tokenization, could we add back MARKER_BLOCKQUOTE when the line/block is finished? #238

Open DivineDominion opened 2 years ago

DivineDominion commented 2 years ago

Another fringe question from me about using libMMD for highlighting :)

TL;DR: What were the observations before you implemented strip_quote_markers_from_block? Could there be another way to solve the problem while retaining the MARKER_BLOCKQUOTE token?

Motivation

For list items, it's trivial to apply a different style to e.g. MARKER_LIST_BULLET. Same with MARKER_H1--MARKER_H6 for headings. This can be useful to apply a less vibrant color to visually "demote" Markdown syntax markers and thus let the real content "pop" more.
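
As a rough illustration of the styling described above, the sketch below walks a token tree and dims the source bytes covered by marker tokens. The `tok` struct, the `TOK_*` constants, `style_for`, and `apply_styles` are hypothetical stand-ins for this example, not libMMD's actual definitions; the sketch only assumes that each token carries a type, a source range, and sibling/child links.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token types; libMMD defines its own enum with other values. */
enum { TOK_TEXT, TOK_MARKER_LIST_BULLET, TOK_MARKER_H1, TOK_BLOCK_LIST_ITEM };

/* Hypothetical, simplified token: a type, a source range, and tree links. */
typedef struct tok {
    int type;
    size_t start, len;   /* byte range in the source text */
    struct tok *next;    /* next sibling */
    struct tok *child;   /* first child */
} tok;

typedef enum { STYLE_NORMAL, STYLE_DIMMED } style;

/* Markers get the dimmed style; everything else stays normal. */
style style_for(int type) {
    switch (type) {
        case TOK_MARKER_LIST_BULLET:
        case TOK_MARKER_H1:
            return STYLE_DIMMED;
        default:
            return STYLE_NORMAL;
    }
}

/* Depth-first walk that marks every byte covered by a marker token. */
void apply_styles(const tok *t, style *out, size_t out_len) {
    for (; t != NULL; t = t->next) {
        assert(t->start + t->len <= out_len);
        if (style_for(t->type) == STYLE_DIMMED) {
            for (size_t i = 0; i < t->len; i++) {
                out[t->start + i] = STYLE_DIMMED;
            }
        }
        apply_styles(t->child, out, out_len);
    }
}
```

The point of the sketch is that the consumer never inspects the text itself: the token type alone decides the style, which is exactly what is not possible for blockquotes once their marker token is gone.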

Observation

I noticed that MARKER_BLOCKQUOTE is removed from the token tree, though. Looking at the code, it appears the reason for the removal is that this helps with "recursion", i.e. multi-level quotations:

/// Strip leading blockquote markers and non-indent space
/// (for recursively parsing blockquotes)
void strip_quote_markers_from_block(mmd_engine * e, token * block) {

I don't yet follow why this stripping was needed, and what'd happen without it. If you remember and could spare the time to explain real quick, that'd be great.

What I'd like to achieve when I understand the problem better

Given that this is technically necessary for what libMMD does at the moment, I don't want to change that. It'd be great, though, if a MARKER_BLOCKQUOTE were added back in a post-processing step once recursive blockquote parsing is finished: leave the stripping in, then restore the marker after whatever the stripping was needed for has completed.

The parser seems to be the best place to apply this because the 'state' ("we're currently parsing a blockquote") is available already.

By now I'm pretty sure that you'll tell me to check the string for > at the beginning of a BLOCK_BLOCKQUOTE when consuming the token tree instead, @fletcher :) And yes, that works, of course. But it'd be so much nicer if the MARKER_* tokens were available consistently. Then one wouldn't have to parse the beginning of the string again to recover state that the parser already had.
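
For reference, the consumer-side workaround mentioned above might look roughly like this: given the source range of a blockquote block, rescan each line for leading `>` characters and record their offsets. `find_quote_markers` and its parameters are hypothetical names for illustration, not part of libMMD, and the whitespace handling is a simplification (up to three leading spaces, then consecutive `>` characters, each optionally followed by one space).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: scans src[start, start+len) line by line and stores
 * the byte offset of each leading '>' marker in `out` (at most `cap` entries).
 * Returns the number of markers found, which may exceed `cap`.
 */
size_t find_quote_markers(const char *src, size_t start, size_t len,
                          size_t *out, size_t cap) {
    assert(src != NULL);
    size_t found = 0;
    size_t end = start + len;
    size_t i = start;

    while (i < end) {
        /* Skip up to three spaces of non-indent whitespace. */
        size_t spaces = 0;
        while (i < end && src[i] == ' ' && spaces < 3) { i++; spaces++; }

        /* Collect consecutive markers such as ">", "> >", or ">>". */
        while (i < end && src[i] == '>') {
            if (found < cap) out[found] = i;
            found++;
            i++;
            if (i < end && src[i] == ' ') i++;  /* optional space after marker */
        }

        /* Advance to the start of the next line. */
        while (i < end && src[i] != '\n') i++;
        if (i < end) i++;
    }
    return found;
}
```

This works, but it duplicates a decision the tokenizer has already made, which is the redundancy the question is about.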

fletcher commented 2 years ago

Short answer - this is something I have looked at and want to change. I did not have an immediate good solution, and there are higher-priority issues on this and other projects I need to resolve first.

But yes - one way or another the marker needs to be removed to allow proper parsing of what is inside the block quote. Can experiment with putting the tokens back in after parsing, vs changing to a “dead” token type (catching all downstream effects of this might be tough) vs another solution.
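
One possible reading of the "dead" token idea, combined with restoring the markers afterwards, is sketched below. It is not how libMMD works today: the `tok` struct, `TOK_MARKER_DEAD`, and all three functions are hypothetical. Instead of unlinking the leading marker at each level of recursion, the sketch retags it so the inner parse can skip it, then turns every dead marker back into a blockquote marker once parsing is done.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token types; TOK_MARKER_DEAD does not exist in libMMD. */
enum { TOK_TEXT, TOK_MARKER_BLOCKQUOTE, TOK_MARKER_DEAD };

/* Hypothetical, simplified token chain for a single line. */
typedef struct tok {
    int type;
    size_t start, len;
    struct tok *next;
} tok;

/* First token the inner parse should look at, skipping dead markers. */
tok *first_live(tok *t) {
    while (t != NULL && t->type == TOK_MARKER_DEAD) t = t->next;
    return t;
}

/*
 * One level of "stripping": if the first live token is a blockquote marker,
 * retag it as dead instead of removing it. Returns 1 if a marker was hidden.
 */
int deaden_first_marker(tok *line) {
    assert(line != NULL);
    tok *t = first_live(line);
    if (t != NULL && t->type == TOK_MARKER_BLOCKQUOTE) {
        t->type = TOK_MARKER_DEAD;
        return 1;
    }
    return 0;
}

/* After parsing: turn every dead marker back into a blockquote marker. */
size_t revive_markers(tok *line) {
    size_t n = 0;
    for (tok *t = line; t != NULL; t = t->next) {
        if (t->type == TOK_MARKER_DEAD) {
            t->type = TOK_MARKER_BLOCKQUOTE;
            n++;
        }
    }
    return n;
}
```

The cost hinted at above shows up in `first_live`: every pass that looks at the start of a line would have to go through something like it, which is where catching all downstream effects gets hard.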

Another issue is that block quote markers will pick up styling of surrounding spans (strong/emph/etc) which would ideally not be the case.

I’m open to suggestions and pull requests. Just have to make sure all of the test suite still passes.
