lycheeverse / lychee

⚡ Fast, async, stream-based link checker written in Rust. Finds broken URLs and mail addresses inside Markdown, HTML, reStructuredText, websites and more!
https://lychee.cli.rs
Apache License 2.0

Per-line excludes #1492

Closed wiktor-k closed 3 months ago

wiktor-k commented 3 months ago

Hi,

I've just started using lychee and it's fantastic!

I've got one suggestion. Sometimes my code contains URL fragments such as:

    let update_link = format!(
        "https://raw.githubusercontent.com/Nitrokey/nethsm-sdk-py/main/tests/{}",
        file_name
    );

And the "not-quite-really-valid-URL" in there makes lychee sad.

I could add raw.githubusercontent.com to the exclusion list, but that would unnecessarily exclude other links I want checked.

Most linters allow for granular exclusions, down to the level of a single line. I've searched the docs, issues and discussions but didn't find anything, so my suggestion is to support a marker that would make lychee ignore the URL on that particular line.

An example:

    let update_link = format!(
        "https://raw.githubusercontent.com/Nitrokey/nethsm-sdk-py/main/tests/{}", // lychee:ignore
        file_name
    );

Thanks for your time!

(If this suggestion is not a good idea, feel free to close the ticket)

mre commented 3 months ago

Are you testing Rust files with lychee? 😉

You could exclude links like so:

    lychee --exclude 'https://raw.githubusercontent.com/Nitrokey/nethsm-sdk-py/main/tests/'

The --exclude option supports regex, so you might also try

    lychee --exclude '\{\}' ...

but that won't work, because the braces get percent-encoded:

    https://raw.githubusercontent.com/Nitrokey/nethsm-sdk-py/main/tests/%7B%7D

This, however, would work:

    lychee --exclude '%7B'

Is that enough in your case?

Markers were discussed in https://github.com/lycheeverse/lychee/issues/1444, but it's a slippery slope. The gist is that parsing is hard.
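
For anyone wondering why the literal braces never match, here is a minimal Rust sketch of the encoding step described above. It is not lychee's actual code: encode_braces and is_excluded are hypothetical helpers, only the two brace characters are encoded, and a plain substring check stands in for the real regex matching.

```rust
/// Hypothetical helper: percent-encode only `{` and `}`.
/// A simplified stand-in for the encoding a URL parser applies to a path.
fn encode_braces(url: &str) -> String {
    let mut out = String::with_capacity(url.len());
    for ch in url.chars() {
        match ch {
            '{' => out.push_str("%7B"),
            '}' => out.push_str("%7D"),
            other => out.push(other),
        }
    }
    out
}

/// Hypothetical helper: a substring check standing in for a regex-based exclude.
fn is_excluded(url: &str, pattern: &str) -> bool {
    url.contains(pattern)
}

fn main() {
    let raw = "https://raw.githubusercontent.com/Nitrokey/nethsm-sdk-py/main/tests/{}";
    // The URL as it looks once the braces have been encoded.
    let checked = encode_braces(raw);
    println!("{checked}");
    // The literal braces are gone, so a pattern looking for them finds nothing.
    println!("matches \"{{}}\": {}", is_excluded(&checked, "{}"));
    // The encoded form is what a pattern has to target.
    println!("matches \"%7B\": {}", is_excluded(&checked, "%7B"));
}
```

The point is only that an exclude pattern has to be written against the encoded form of the URL, not against the text as it appears in the source file.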

wiktor-k commented 3 months ago

> Are you testing Rust files with lychee? 😉

Yes, I've got quite extensive documentation in these doc-comments 😅

> Is that enough in your case?

Well, I ended up marking these URLs in the config file, which, I guess, is good enough: https://gitlab.archlinux.org/archlinux/signstar/-/merge_requests/48/diffs#9a1a8f7c952ee697ad8a1b56a05a26b78db09c19_0_5

> Markers were discussed in https://github.com/lycheeverse/lychee/issues/1444, but it's a slippery slope. The gist is that parsing is hard.

Agreed. I'll close this ticket even though it's not resolved, since there's a link to the prior discussion.

Thanks! 👋

mre commented 3 months ago

Nice. One final addition: you can also put exclusions into a dedicated .lycheeignore file. I see that you already have a lychee.toml, so you're all set, but just in case someone finds this and wants to quickly ignore a few links permanently, that's another way to do that. I realized the documentation around that functionality was lacking, so I updated it.

wiktor-k commented 2 months ago

.lycheeignore is an excellent idea since the rest of the config won't change as often as this one. Thanks! 🙇‍♂️