markdown-it-rust / markdown-it

markdown-it js library rewritten in rust
Other
79 stars 9 forks source link

implement try_run and error propagation #19

Open rlidwka opened 1 year ago

rlidwka commented 1 year ago

see discussion in https://github.com/rlidwka/markdown-it.rs/issues/11

This PR adds a new function MarkdownIt::try_parse(), which propagates errors from user plugins.

As an example of one such error, syntect plugin now reports non-existent language tag in try_parse() (but still returns correct output when running parse()):

let mut md = MarkdownIt::new();
markdown_it::plugins::cmark::add(&mut md);
markdown_it::plugins::extra::syntect::add(&mut md);

let input = r#"
``` non-existent-language
hello

"#;

let html = md.parse(input).render(); dbg!(html); // html = "<pre style=\"background-color:#ffffff;\">\n<span style=\"color:#323232;\">hello\n\n"

let html = md.try_parse(input).map(|ast| ast.render()); dbg!(html); // html = Err( // Error { // context: "core rule SyntectRule returned an error", // source: "syntax not found for language non-existent-language", // }, //)

codecov-commenter commented 1 year ago

Codecov Report

Patch coverage: 90.18% and project coverage change: -0.06 :warning:

Comparison is base (e4b6beb) 94.12% compared to head (c1bdfba) 94.07%.

:exclamation: Current head c1bdfba differs from pull request most recent head 32b33e8. Consider uploading reports for the commit 32b33e8 to get more accurate results

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #19 +/- ## ========================================== - Coverage 94.12% 94.07% -0.06% ========================================== Files 57 57 Lines 2453 2685 +232 ========================================== + Hits 2309 2526 +217 - Misses 144 159 +15 ``` | [Impacted Files](https://app.codecov.io/gh/rlidwka/markdown-it.rs/pull/19?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Alex+Kocharin) | Coverage Δ | | |---|---|---| | [src/plugins/extra/heading\_anchors.rs](https://app.codecov.io/gh/rlidwka/markdown-it.rs/pull/19?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Alex+Kocharin#diff-c3JjL3BsdWdpbnMvZXh0cmEvaGVhZGluZ19hbmNob3JzLnJz) | `0.00% <0.00%> (ø)` | | | [src/plugins/extra/linkify.rs](https://app.codecov.io/gh/rlidwka/markdown-it.rs/pull/19?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Alex+Kocharin#diff-c3JjL3BsdWdpbnMvZXh0cmEvbGlua2lmeS5ycw==) | `95.38% <83.33%> (-1.34%)` | :arrow_down: | | [src/parser/block/mod.rs](https://app.codecov.io/gh/rlidwka/markdown-it.rs/pull/19?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Alex+Kocharin#diff-c3JjL3BhcnNlci9ibG9jay9tb2QucnM=) | `85.88% <84.78%> (-2.76%)` | :arrow_down: | | [src/parser/block/builtin/block\_parser.rs](https://app.codecov.io/gh/rlidwka/markdown-it.rs/pull/19?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Alex+Kocharin#diff-c3JjL3BhcnNlci9ibG9jay9idWlsdGluL2Jsb2NrX3BhcnNlci5ycw==) | `91.30% <86.66%> (-8.70%)` | :arrow_down: | | [src/parser/main.rs](https://app.codecov.io/gh/rlidwka/markdown-it.rs/pull/19?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Alex+Kocharin#diff-c3JjL3BhcnNlci9tYWluLnJz) | `87.93% <86.84%> (-4.07%)` | :arrow_down: | | [src/parser/inline/mod.rs](https://app.codecov.io/gh/rlidwka/markdown-it.rs/pull/19?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Alex+Kocharin#diff-c3JjL3BhcnNlci9pbmxpbmUvbW9kLnJz) | `84.94% <87.75%> (-1.33%)` | :arrow_down: | | [src/plugins/sourcepos.rs](https://app.codecov.io/gh/rlidwka/markdown-it.rs/pull/19?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Alex+Kocharin#diff-c3JjL3BsdWdpbnMvc291cmNlcG9zLnJz) | `93.75% <88.88%> (-6.25%)` | :arrow_down: | | [src/parser/inline/builtin/inline\_parser.rs](https://app.codecov.io/gh/rlidwka/markdown-it.rs/pull/19?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Alex+Kocharin#diff-c3JjL3BhcnNlci9pbmxpbmUvYnVpbHRpbi9pbmxpbmVfcGFyc2VyLnJz) | `95.23% <90.90%> (-4.77%)` | :arrow_down: | | [src/common/ruler.rs](https://app.codecov.io/gh/rlidwka/markdown-it.rs/pull/19?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Alex+Kocharin#diff-c3JjL2NvbW1vbi9ydWxlci5ycw==) | `85.04% <100.00%> (+6.29%)` | :arrow_up: | | [src/common/typekey.rs](https://app.codecov.io/gh/rlidwka/markdown-it.rs/pull/19?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Alex+Kocharin#diff-c3JjL2NvbW1vbi90eXBla2V5LnJz) | `100.00% <100.00%> (ø)` | | | ... and [9 more](https://app.codecov.io/gh/rlidwka/markdown-it.rs/pull/19?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Alex+Kocharin) | |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

begleynk commented 1 year ago

This looks really cool!

Wanted to clarify something from my comment in #11: At least in my case, even if there are some errors in my plugins, it may not stop us from rendering something. The node itself could render its error into the HTML, but it's still nice to be able to know in the return type that an error occurred. Also, there could be multiple errors in the document, and it would be nice to get them all in one pass. This unfortunately doesn't really fit into the Rust Result model where you return early at the first error encountered.

This may be a bit of a specialized use case, so I understand if it's something you'd not want to add to the library - there's ways around it using the old parse function, but thought I'd flag it in case you found it useful.

Thanks for all your work on this library!

rlidwka commented 1 year ago

Also, there could be multiple errors in the document, and it would be nice to get them all in one pass.

That use-case is already supported with parse() function, see code below. You can accumulate errors as metadata in AST, then collect them all afterwards.

Note that there's a bug currently with inline rules, which I just fixed in master branch.

use markdown_it::parser::block::{BlockRule, BlockState};
use markdown_it::parser::core::CoreRule;
use markdown_it::parser::extset::NodeExt;
use markdown_it::parser::inline::{InlineRule, InlineState};
use markdown_it::{MarkdownIt, Node};

#[derive(Debug, Default)]
struct NodeErrors(Vec<anyhow::Error>);
impl NodeExt for NodeErrors {}

struct MyInlineRule;
impl InlineRule for MyInlineRule {
    const MARKER: char = '@';

    fn run(state: &mut InlineState) -> Option<(Node, usize)> {
        let err = state.node.ext.get_or_insert_default::<NodeErrors>();
        err.0.push(anyhow::anyhow!("inline error"));
        None
    }
}

struct MyBlockRule;
impl BlockRule for MyBlockRule {
    fn run(state: &mut BlockState) -> Option<(Node, usize)> {
        let err = state.node.ext.get_or_insert_default::<NodeErrors>();
        err.0.push(anyhow::anyhow!("block error"));
        None
    }
}

struct MyCoreRule;
impl CoreRule for MyCoreRule {
    fn run(root: &mut Node, _md: &MarkdownIt) {
        let err = root.ext.get_or_insert_default::<NodeErrors>();
        err.0.push(anyhow::anyhow!("core error"));
    }
}

fn main() {
    let md = &mut markdown_it::MarkdownIt::new();
    markdown_it::plugins::cmark::add(md);

    md.inline.add_rule::<MyInlineRule>();
    md.block.add_rule::<MyBlockRule>();
    md.add_rule::<MyCoreRule>();

    let text1 = r#"*hello @world*"#;
    let ast = md.parse(text1);

    println!("parsed as:");
    println!("{}", ast.render());

    println!("errors:");
    ast.walk(|node, _| {
        if let Some(errors) = node.ext.get::<NodeErrors>() {
            for error in errors.0.iter() {
                println!(" - {error:?}");
            }
        }
    });
}

Of course the problem with that approach is that errors are not standardized, i.e. two plugins from different authors would report errors in different ways.

rlidwka commented 1 year ago

This PR addresses specifically the need to terminate parsing early.

I don't know if there is such a need, and if this is the best approach. Maybe we can use a single parse() function and report errors via function like state.report_error() (which can either stop parsing or collect multiple errors, depending on the settings).

begleynk commented 1 year ago

That use-case is already supported with parse() function, see code below. You can accumulate errors as metadata in AST, then collect them all afterwards.

Makes perfect sense 👍 I hadn't thought about using a NodeExt - thanks for the example. This approach works perfectly for my use case.

I don't know if there is such a need, and if this is the best approach.

Yeah I'm not sure about the best option myself either. Maybe someone has that specific use case and they could comment here? Apologies if my comment ended up causing extra work! Should have been more explicit originally.