markdown-it-rust / markdown-it

markdown-it js library rewritten in rust

Some failures of markdown-it JS test fixtures #23

Open chrisjsewell opened 1 year ago

chrisjsewell commented 1 year ago

In https://github.com/chrisjsewell/markdown-it-pyrs/tree/main/tests/fixtures, I have essentially taken all the test fixtures that were present in markdown-it and applied them to this implementation.

I am currently skipping a few tests failing in commonmark_extras.md:

A few in normalize.md:

and all of linkify.md, since the current plugin no longer parses "bare" URLs, e.g. it parses http://example.com but not www.example.com

Are these known issues/changes?

Note: ideally this linkify plugin, or perhaps a separate plugin, would provide a GFM-compliant rule for autolinks: https://github.github.com/gfm/#autolinks-extension-

chrisjsewell commented 1 year ago

I guess it would be great to have a mechanism in Rust to test these fixture files directly. There are crates like https://crates.io/crates/testing, https://crates.io/crates/rstest, and https://crates.io/crates/insta, but I haven't found one that completely meets the criteria yet 🤔
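As a rough illustration of what such a mechanism could look like, here is a minimal std-only sketch of a fixture reader. It assumes the usual markdown-it fixture layout (a title, then a lone `.`, the Markdown input, another `.`, the expected HTML, and a closing `.`). The names `Fixture`, `parse_fixtures` and `take_block` are made up for this example and are not part of any crate mentioned above.

```rust
/// One test case from a markdown-it style fixture file (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Fixture {
    pub title: String,
    pub input: String,
    pub expected: String,
}

/// Parse fixtures laid out as: title, ".", input, ".", expected output, ".".
/// Assumption: neither the input nor the output contains a line that is only ".".
pub fn parse_fixtures(source: &str) -> Vec<Fixture> {
    let mut fixtures = Vec::new();
    let mut title = String::new();
    let mut lines = source.lines();

    while let Some(line) = lines.next() {
        if line != "." {
            // Remember the last non-empty line before a case as its title.
            if !line.trim().is_empty() {
                title = line.trim().to_string();
            }
            continue;
        }
        // Opening "." found: read the input block, then the expected block.
        let input = take_block(&mut lines);
        let expected = take_block(&mut lines);
        fixtures.push(Fixture {
            title: std::mem::take(&mut title),
            input,
            expected,
        });
    }
    fixtures
}

/// Collect lines until a lone "." (or end of input), each terminated by '\n'.
fn take_block<'a>(lines: &mut impl Iterator<Item = &'a str>) -> String {
    let mut block = String::new();
    while let Some(line) = lines.next() {
        if line == "." {
            break;
        }
        block.push_str(line);
        block.push('\n');
    }
    block
}
```

A test harness could then loop over `parse_fixtures(include_str!(...))`, render each `input` with the parser under test, and compare the result against `expected`, skipping known failures by `title`.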

rlidwka commented 1 year ago

> and all of linkify.md, since the current plugin no longer parses "bare" URLs, e.g. it parses http://example.com/ but not www.example.com

I use completely different linkify implementation, not related to JS in any way. Don't expect feature parity there.

You can open any issues you find at https://github.com/robinst/linkify

For bare urls, you can read this discussion: https://github.com/robinst/linkify/issues/7

rlidwka commented 1 year ago

The rest are probably bugs, which I'll have to look into shortly.

rlidwka commented 1 year ago

> The rest are probably bugs, which I'll have to look into shortly.

yep, that's three bugs out of the way: https://github.com/rlidwka/markdown-it.rs/commit/c2919dd0b123f3aeb9264a6a6ec8a0d01cfbe19f, https://github.com/rlidwka/markdown-it.rs/commit/1773eee36016dd947eaabc919c12cbabb79568fd, https://github.com/rlidwka/markdown-it.rs/commit/aa4bc6559201064650e9fb314d332642507461d4

chrisjsewell commented 1 year ago

>> and all of linkify.md, since the current plugin no longer parses "bare" URLs, e.g. it parses http://example.com/ but not www.example.com
>
> I use completely different linkify implementation, not related to JS in any way. Don't expect feature parity there. You can open any issues you find at robinst/linkify

I'd note this is not a limitation of linkify, which does parse these bare URLs: it will parse emails like me@example.com by default, but requires finder.url_must_have_scheme(false) to parse e.g. www.example.com

The main limitation is in the plugin implementation here, which only searches for `:` (https://github.com/rlidwka/markdown-it.rs/blob/c2919dd0b123f3aeb9264a6a6ec8a0d01cfbe19f/src/plugins/extra/linkify.rs#L80), so it obviously will not identify any scheme-less URLs
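To make the point concrete, here is a small std-only sketch of trigger-based scanning. It is not the plugin's actual code: `colon_triggers` and `www_triggers` are hypothetical helpers that only show why a scanner keyed on `:` never gets a chance to look at www.example.com, and what a second trigger might look like.

```rust
/// Byte offsets where a ':'-triggered scanner would start looking for a URL.
pub fn colon_triggers(text: &str) -> Vec<usize> {
    text.match_indices(':').map(|(i, _)| i).collect()
}

/// Byte offsets of "www." prefixes at the start of a word.
/// This is a hypothetical second trigger, not something the plugin implements.
pub fn www_triggers(text: &str) -> Vec<usize> {
    text.match_indices("www.")
        .filter(|&(i, _)| {
            // Accept a match only at the start of the text or right after whitespace.
            text[..i].chars().next_back().map_or(true, char::is_whitespace)
        })
        .map(|(i, _)| i)
        .collect()
}
```

For "see http://example.com" the first helper reports the `:` after the scheme, while for "see www.example.com" it reports nothing, so a link finder is never invoked on that text.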

I've now implemented https://crates.io/crates/markdown-it-autolink, for strict GFM autolink parsing. The only known issue is that it currently cannot parse emails with `_` before the `@`, e.g. a_b@example.com. This is because it back-tracks from the `@`, but the `_` has already been parsed as an emphasis token. Not sure yet how to fix this 🤷‍♂️ (https://github.com/chrisjsewell/markdown-it-plugins.rs/issues/13)
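The back-tracking problem can be illustrated with a std-only sketch. The helper `local_part_start` is hypothetical (it is not taken from markdown-it-autolink) and uses the character set GFM allows in the local part of an extended email autolink: alphanumerics plus `.`, `-`, `_` and `+`.

```rust
/// Walk backwards from the '@' at byte offset `at` and return the byte offset
/// where the email's local part starts (hypothetical helper for illustration).
pub fn local_part_start(text: &str, at: usize) -> usize {
    let mut start = at;
    for (i, c) in text[..at].char_indices().rev() {
        if c.is_ascii_alphanumeric() || matches!(c, '.' | '-' | '_' | '+') {
            // Still inside the local part: extend it to the left.
            start = i;
        } else {
            break;
        }
    }
    start
}
```

On the raw string "a_b@example.com" the walk reaches offset 0 and the local part is "a_b". If, as assumed here, the pending text available to the rule is only "b@example.com" because the `_` was already consumed as an emphasis delimiter, the same walk stops after "b" and the `a_` prefix is out of reach.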

rlidwka commented 1 year ago

> The main limitation is in the plugin implementation here, which only searches for `:`

ah okay, in the original markdown-it we have two rules (an inline rule that catches copy-pasted urls and a core rule that catches everything), but here I've only implemented the first one for simplicity

so the way to fix it is to re-scan all text later, after the inline parser is done, see: https://github.com/markdown-it/markdown-it/blob/master/lib/rules_core/linkify.js

compare it to the rule I've implemented here (and you probably copied): https://github.com/markdown-it/markdown-it/blob/master/lib/rules_inline/linkify.js
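For illustration, a re-scan pass of that kind might look roughly like the std-only sketch below. It does not use the real markdown-it.rs node types: `Node`, `linkify_pass`, `split_text` and `find_bare_url` are all hypothetical, and `find_bare_url` is a deliberately naive stand-in for a real link finder such as the linkify crate.

```rust
/// Simplified stand-in for a parsed inline tree (not the markdown-it.rs types).
#[derive(Debug, PartialEq)]
pub enum Node {
    Text(String),
    Link { url: String },
    Emphasis(Vec<Node>),
}

/// Naive stand-in for a link finder: byte range of the first "www." run,
/// ending at the next whitespace or the end of the text.
fn find_bare_url(text: &str) -> Option<(usize, usize)> {
    let start = text.find("www.")?;
    let len = text[start..]
        .find(char::is_whitespace)
        .unwrap_or(text.len() - start);
    Some((start, start + len))
}

/// Core-rule style pass: walk the finished tree and split text nodes around URLs.
pub fn linkify_pass(nodes: Vec<Node>) -> Vec<Node> {
    let mut out = Vec::new();
    for node in nodes {
        match node {
            Node::Text(text) => split_text(&text, &mut out),
            // Recurse into containers so nested text is re-scanned too.
            Node::Emphasis(children) => out.push(Node::Emphasis(linkify_pass(children))),
            // Existing links are left untouched.
            link @ Node::Link { .. } => out.push(link),
        }
    }
    out
}

/// Replace one text run with alternating Text and Link nodes.
fn split_text(mut rest: &str, out: &mut Vec<Node>) {
    while let Some((start, end)) = find_bare_url(rest) {
        if start > 0 {
            out.push(Node::Text(rest[..start].to_string()));
        }
        out.push(Node::Link { url: rest[start..end].to_string() });
        rest = &rest[end..];
    }
    if !rest.is_empty() {
        out.push(Node::Text(rest.to_string()));
    }
}
```

Because the pass runs on complete text nodes rather than on a trigger character, it sees scheme-less URLs as well, at the cost of scanning every text node once more.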