raviqqe / liche

Fast Link Checker for Markdown and HTML in Go

Regex not excluding internal links #43

Open JackMcKew opened 4 years ago

JackMcKew commented 4 years ago

Hey!

Thank you for this amazing tool!

I've tried using it for link verification of a static site generator package (packit.dev), but I want to exclude all internal links (e.g., ones beginning with "/" or "{"). Here's the example regex: https://regex101.com/r/De199r/1

I've tried the regex in all sorts of ways, but it never seems to exclude them. From my understanding, this might be due to the order of evaluation: it checks the internal link before it checks whether the link should be excluded.
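One way to rule out the pattern itself is to test it in isolation. The sketch below assumes the tool compiles the exclusion pattern with Go's standard `regexp` package (an assumption, not confirmed in this thread). The pattern `^[/{]` and the sample links are hypothetical stand-ins and are not copied from the regex101 example.

```go
package main

import (
	"fmt"
	"regexp"
)

// internalLink is a hypothetical exclusion pattern: it matches any link
// that starts with "/" or "{". It is not the pattern from the regex101 link.
var internalLink = regexp.MustCompile(`^[/{]`)

// isInternal reports whether the pattern matches the link, i.e. whether
// the link would be a candidate for exclusion.
func isInternal(link string) bool {
	return internalLink.MatchString(link)
}

func main() {
	// Sample links are made up for illustration.
	links := []string{"/about/", "{filename}/post.md", "https://example.com/"}
	for _, link := range links {
		fmt.Printf("%q excluded=%v\n", link, isInternal(link))
	}
}
```

If a pattern behaves as intended here but links are still reported, the pattern syntax is probably not the cause, which would be consistent with the order-of-evaluation guess above.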

You can see all my attempts to get this working in this repo: https://github.com/JackMcKew/packit.dev, and the GitHub Actions runs they triggered, with the errors: https://github.com/JackMcKew/packit.dev/actions

Maybe I'm making a silly error, but I'd love to be able to use this package for this purpose.

Thank you!

MichaIng commented 4 years ago

Found the same. My guess was that the extended regex set is not supported, e.g. that (pattern1|pattern2) does not work, in which case the special meaning of ? and + would not work either, as when you use grep or sed without the -E flag (on distros where it's not the default). I'm just going to verify this. EDIT: Nope, even simple words have no effect, like: -x phpbb

    ERROR   phpbb/viewtopic.php?f=8&t=5#p43
        Stat site/phpbb/viewtopic.php: no such file or directory
    ERROR   phpbb/viewtopic.php?f=8&t=5#p45
        Stat site/phpbb/viewtopic.php: no such file or directory
...

It looks like local/internal links are generally not processed for exclusion at all, regardless of whether they are relative or absolute paths.