lycheeverse / lychee

⚡ Fast, async, stream-based link checker written in Rust. Finds broken URLs and mail addresses inside Markdown, HTML, reStructuredText, websites and more!
https://lychee.cli.rs
Apache License 2.0
1.98k stars, 119 forks

Ignore links in prefix attributes #1209

Open mre opened 1 year ago

mre commented 1 year ago

E.g. the URL in the prefix attribute here should not be checked as a link:

<html lang="en-EN" prefix="og: https://ogp.me/ns#">

xlai89 commented 4 months ago

Hi @mre ,

I had a very good first experience with lychee. Thank you for developing the tool! I am also learning to program more in Rust and want to contribute more to the open-source community. May I try working on this issue?

Some guidance would be really helpful, though. Would it be a valid approach to extend the element/attribute combinations in html5gum.rs and html5ever.rs and then add "html" to the function is_verbatim_elem? Would this affect other use cases where "html" should not be excluded when include_verbatim is false?
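
To make the question concrete, here is a minimal sketch of one possible alternative: matching on the element/attribute pair instead of treating the whole "html" element as verbatim, so that other attributes on the element stay unaffected. The function name is_ignored_attr and the IGNORED table are hypothetical and used only for illustration. They are not part of lychee's code, and the real extractors in html5gum.rs and html5ever.rs may be structured differently.

```rust
/// Hypothetical helper (an assumption, not lychee's actual API): returns true
/// when the value of `attr` on `element` should be skipped while extracting links.
pub fn is_ignored_attr(element: &str, attr: &str) -> bool {
    // (element, attribute) pairs whose values contain URLs that are not links.
    // The `prefix` attribute on `<html>` only declares namespaces such as
    // `og: https://ogp.me/ns#`.
    const IGNORED: &[(&str, &str)] = &[("html", "prefix")];

    // HTML element and attribute names are ASCII case-insensitive.
    IGNORED
        .iter()
        .any(|(e, a)| e.eq_ignore_ascii_case(element) && a.eq_ignore_ascii_case(attr))
}
```

With a check like this, is_ignored_attr("html", "prefix") is true, while is_ignored_attr("a", "href") and is_ignored_attr("html", "lang") are false, so only the namespace declaration from the example above would be skipped.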

mre commented 3 months ago

Hey @xlai89, thanks for considering working on this! That would be great. 👍

The changes would be quite similar to https://github.com/lycheeverse/lychee/pull/1187. Perhaps you want to tackle that one first? You can try to follow the comments and see if you can help out. It's of course totally fine to also tackle this issue first. You'll probably follow the same structure anyway, so the order shouldn't matter that much.

xlai89 commented 3 months ago

Thanks for the hint. I'll try to understand and maybe work on PR #1187 first.