lycheeverse / lychee

⚡ Fast, async, stream-based link checker written in Rust. Finds broken URLs and mail addresses inside Markdown, HTML, reStructuredText, websites and more!
https://lychee.cli.rs
Apache License 2.0

Support underscores in Markdown URLs #1555

Closed mre closed 2 weeks ago

mre commented 2 weeks ago

By default, `pulldown_cmark` emits multiple text events when it encounters an underscore (`_`) in a run of text. It does this for performance reasons. More information is available in [pulldown-cmark issue #146](https://github.com/pulldown-cmark/pulldown-cmark/issues/146).

The correct way to handle this is to wrap the parser in the `TextMergeStream` helper struct, an iterator adapter that merges consecutive `Event::Text` events into one.

With this, we can correctly parse links like http://example.com/_/foo.
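To illustrate the merging step described above, here is a minimal, standard-library-only sketch. The `Event` enum and the `merge_text` function are hypothetical stand-ins written for this example; they are not `pulldown_cmark`'s real types or its actual implementation, and the exact way the parser splits the text is assumed for illustration. In the real crate the equivalent step is wrapping the parser, e.g. `TextMergeStream::new(Parser::new(input))`.

```rust
// Hypothetical stand-in for a Markdown parser event.
// This is NOT pulldown_cmark's `Event` type; it only models the two
// cases needed to show how merging works.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Text(String),
    SoftBreak,
}

/// Merge each run of consecutive `Event::Text` events into a single
/// `Event::Text`, leaving all other events untouched and in order.
fn merge_text(events: impl IntoIterator<Item = Event>) -> Vec<Event> {
    let mut merged: Vec<Event> = Vec::new();
    for event in events {
        if let Event::Text(next) = &event {
            // If the previous emitted event is also text, append to it
            // instead of pushing a new event.
            if let Some(Event::Text(prev)) = merged.last_mut() {
                prev.push_str(next);
                continue;
            }
        }
        merged.push(event);
    }
    merged
}
```

Given an assumed split such as `Text("http://example.com/")`, `Text("_")`, `Text("/foo")`, the function yields one `Text` event containing the full URL, so a link extractor that inspects one text event at a time sees `http://example.com/_/foo` in one piece.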

Fixes #1529