Closed: thomasgloe closed this issue 2 years ago
As a workaround, it is possible to split the string into spans first and then search each span for links:
use linkify::LinkFinder;

let input = "Look, no scheme: example.org/foo Email: email@foo.com";
let mut finder = LinkFinder::new();
finder.url_must_have_scheme(false);
// Split the input into spans, then run the finder again on each span.
let spans: Vec<_> = finder.spans(input).collect();
for span in spans {
    let links: Vec<_> = finder.links(span.as_str()).collect();
    for link in links {
        println!(" - link: {}", link.as_str());
    }
}
Not sure if it helps but can you also check against https://github.com/robinst/linkify/pull/43?
Hm, I've switched to a simple regex approach, as I've observed additional issues in my test case. The URLs in my data are not too complicated. Even the workaround above did not fix all of the issues.
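The regex itself is not shown in this thread. For illustration only, here is a dependency-free sketch of a comparably simple approach: split the input on whitespace so that no match can span a line break, then classify each token. The names Kind, looks_like_domain and find_simple_links are hypothetical, and the heuristics are deliberately naive (a token such as "e.g." or "1.5" would be reported as a URL). This is neither the regex mentioned above nor how linkify works internally.

```rust
/// Kind of link found in a token (hypothetical helper type for this sketch).
#[derive(Debug, PartialEq)]
enum Kind {
    Url,
    Email,
}

/// Naive domain check: at least two dot-separated labels, each non-empty
/// and made up only of ASCII letters, digits or hyphens.
fn looks_like_domain(s: &str) -> bool {
    let labels: Vec<&str> = s.split('.').collect();
    labels.len() >= 2
        && labels.iter().all(|label| {
            !label.is_empty()
                && label.chars().all(|c| c.is_ascii_alphanumeric() || c == '-')
        })
}

/// Scan whitespace-separated tokens one by one, so a match can never
/// extend across a space or a line break.
fn find_simple_links(input: &str) -> Vec<(Kind, &str)> {
    let mut found = Vec::new();
    for token in input.split_whitespace() {
        // Strip surrounding punctuation such as ":", "," or parentheses.
        let token = token.trim_matches(|c: char| !c.is_ascii_alphanumeric());
        if let Some((local, domain)) = token.split_once('@') {
            // Email candidate: non-empty local part plus a plausible domain.
            if !local.is_empty() && looks_like_domain(domain) {
                found.push((Kind::Email, token));
            }
        } else {
            // URL candidate without scheme: everything before the first '/'
            // has to look like a domain.
            let host = token.split('/').next().unwrap_or("");
            if looks_like_domain(host) {
                found.push((Kind::Url, token));
            }
        }
    }
    found
}
```

Because each token is checked on its own, a label such as "Web:" on the preceding line cannot be merged into the URL that follows it. The trade-off is that this sketch knows nothing about schemes, ports, valid top-level domains or the other cases that linkify handles.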
Yep, #43 fixes that problem too, I'll add that as a test case.
@thomasgloe can you provide the additional problematic cases that you've found here?
Code example:
use linkify::{Link, LinkFinder};

fn find_links(input: &str) -> Vec<Link> {
    let mut finder = LinkFinder::new();
    finder.url_must_have_scheme(false);
    let mut links = Vec::new();
    let spans: Vec<_> = finder.spans(input).collect();
    for span in spans {
        // added second finder, to test if this makes any difference - it does not.
        let mut finder2 = LinkFinder::new();
        finder2.url_must_have_scheme(false);
        let mut tlinks: Vec<_> = finder2.links(span.as_str()).collect();
        links.append(&mut tlinks);
    }
    links
}

fn main() {
    // multiline input string
    let input = "Web:
www.foobar.co
E-Mail:
bar@foobar.co (bla bla bla)";
    let links = find_links(input);
    for link in links {
        println!(" - link: {}", link.as_str());
    }
}
results in:
- link: Web:
www.foobar.co
- link: bar@foobar.co
But I would expect:
- link: www.foobar.co
- link: bar@foobar.co
Indeed, I've checked with
linkify = { git = "https://github.com/robinst/linkify", branch = "check-domains" }
and the problematic case above seems to work.
Good to hear! To be honest, the implementation of the url_must_have_scheme(false) mode had a few problems before. With the branch, its logic is now unified with the others and much cleaner.
I'm releasing the change soon.
Alright, released as 0.9.0 🎉: https://github.com/robinst/linkify/blob/main/CHANGELOG.md#090---2022-07-11
Hi,
I tested the example from the docs and it works great:
However, when the input string is changed to
let input = "Look, no scheme: example.org/foo email@foo.com";
the URL example.org/foo is not detected anymore. The same applies to the demo website https://robinst.github.io/linkify/ with this input string. Is this the expected outcome or a bug? Is there an additional switch to detect the URL even if there is an email address in the string?
Related to https://github.com/robinst/linkify/pull/8