robinst / linkify

Rust library to find links such as URLs and email addresses in plain text, handling surrounding punctuation correctly
https://robinst.github.io/linkify/
Apache License 2.0
206 stars 12 forks source link

Don't require host for `file` scheme #58

Closed mre closed 1 year ago

mre commented 1 year ago

In 0.9.0, URL parsing became more restrictive to avoid some false positives. While the changes improved robustness, they had the side-effect of breaking a use-case for lychee, which allowed relative file paths as inputs.

For example, this works as expected and checks somefile for broken links:

echo 'file://somefile' | lychee -

However, relative paths don't work anymore:

echo 'file://../somefile' | lychee -

The link (and therefore the input) is no longer detected.

It looks like this is a regression, because linkify now requires a host for file schemes: https://github.com/robinst/linkify/blob/b1d8e3c76e33118844626fe20489912ff51dc0dc/src/url.rs#L137-L142

In the above example, somefile is detected as the "host" (while it is more likely to be a filename in this case), whereas linkify doesn't detect a host in ../somefile.

The fix is straightforward if we are willing to accept file URLs without a host again. (On a more general note, I don't think the concept of a "host" applies to files anyway.)

I made the change and added a test and the fix does not affect any other use-cases, which makes me hopeful that this can be merged.

mre commented 1 year ago

@robinst, how do you feel about this PR? Do you think we can merge this, or do you have any concerns? The failing test seems to be unrelated.

robinst commented 1 year ago

Hmm, shouldn't a relative path (without authority) look like this?:

file:///../somefile

That is currently accepted. Having said that, I don't really care about file URLs that much, so I'm ok with making the change 😅. Let me check the build.

robinst commented 1 year ago

Released in 0.10.0 now: https://github.com/robinst/linkify/blob/main/CHANGELOG.md#0100---2023-06-24