lycheeverse / lychee

⚡ Fast, async, stream-based link checker written in Rust. Finds broken URLs and mail addresses inside Markdown, HTML, reStructuredText, websites and more!
https://lychee.cli.rs
Apache License 2.0
1.9k stars 116 forks

Proxy / PAC file support #869

Open soredake opened 1 year ago

soredake commented 1 year ago

It would be nice if lychee supported proxies and PAC files; I have sites that are blocked in my country.

mre commented 1 year ago

I'm guessing you mean files as described here? Can you post an example so we can have a look? I'm wondering if it already works if you pass it to lychee as an input, e.g. lychee foo.pac or something.

soredake commented 1 year ago

To clarify, I don't want to check the PAC file itself. I want to use it with lychee, like lychee --proxy-file my-pac-file.pac, so that it uses my configured proxies when checking URLs.

mre commented 1 year ago

Ooh, of course, sorry. That makes a lot of sense. We'd need to find a Rust crate that parses the format, and then pass the parameters to reqwest. I think they have support for that. (Can't check right now.)

mre commented 1 year ago

I had a quick look and reqwest supports proxies just fine: https://docs.rs/reqwest/latest/reqwest/struct.Proxy.html. Now we only need a parser for the PAC format.

sanmai-NL commented 7 months ago

@mre It would be helpful if proxying support were separated from PAC file support.

mre commented 6 months ago

The question is what that would look like. Any preferred syntax? Something like --proxy 127.0.0.1:8080, for example?

sanmai-NL commented 6 months ago

Supporting the proxying environment variables similar to curl (and other tools) would be the most convenient and standard starting point.

mre commented 6 months ago

At some point I really need to look at the curl docs. So many great features.

sanmai-NL commented 6 months ago

Yeah, and this part of it is pretty conventional across tools.

mre commented 6 months ago

We'll stick to that then.

sanmai-NL commented 6 months ago

https://curl.se/docs/manpage.html has an Environment section that contains the relevant information. SOCKS5 proxying is the most relevant feature as it can support TLS better.
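
For illustration, here is a minimal Rust sketch of the curl-style lookup order described in that Environment section. It is not lychee or reqwest code: `proxy_for` and the `lookup` parameter are hypothetical names, and the NO_PROXY handling is simplified to "*" and plain host-suffix matching (no ports or IP ranges). It assumes the behaviour curl documents: the lowercase spelling wins over uppercase, and http_proxy is only honoured in lowercase.

```rust
/// Returns the proxy URL that would apply to a request, or None for a direct
/// connection. `lookup` stands in for `std::env::var(name).ok()` so the logic
/// can be exercised without touching the real environment.
fn proxy_for<F>(scheme: &str, host: &str, lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    // Try the lowercase name first, then the uppercase one; skip empty values.
    let get = |lower: &str| -> Option<String> {
        lookup(lower)
            .filter(|v| !v.is_empty())
            .or_else(|| lookup(&lower.to_uppercase()).filter(|v| !v.is_empty()))
    };

    // NO_PROXY: comma-separated host suffixes, or "*" to bypass every proxy.
    if let Some(no_proxy) = get("no_proxy") {
        let bypass = no_proxy.split(',').map(str::trim).any(|entry| {
            let entry = entry.trim_start_matches('.');
            entry == "*"
                || (!entry.is_empty()
                    && (host == entry || host.ends_with(&format!(".{entry}"))))
        });
        if bypass {
            return None;
        }
    }

    let specific = match scheme {
        // Assumption taken from curl's docs: only lowercase http_proxy counts.
        "http" => lookup("http_proxy").filter(|v| !v.is_empty()),
        "https" => get("https_proxy"),
        _ => None,
    };

    // ALL_PROXY is the fallback when no scheme-specific variable is set.
    specific.or_else(|| get("all_proxy"))
}
```

In a real binary the closure would simply be `|name| std::env::var(name).ok()`; passing it in as a parameter just keeps the precedence rules easy to test.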

mre commented 6 months ago

I don't know much about SOCKS. I was thinking of

 lychee --proxy http://proxy.example https://example.com

Similar to the example they show in the docs (thanks for sharing!). Would a socks proxy be a superset of an HTTP proxy, then? Is it a strict superset?

sanmai-NL commented 6 months ago

Do you mean a complement? It's a (strict) superset, yes.
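
As a rough sketch of how a single --proxy value could cover both HTTP and SOCKS5 proxies, the scheme of the proxy URL can select the proxy type, which is the convention curl uses. Everything here is hypothetical (`ProxyKind` and `classify_proxy` are not part of lychee or reqwest), and treating a value without a scheme as an HTTP proxy is an assumption borrowed from curl's behaviour.

```rust
/// Proxy types a hypothetical `--proxy` flag might accept.
#[derive(Debug, PartialEq)]
enum ProxyKind {
    Http,
    Https,
    Socks5,
    /// SOCKS5 where the proxy, not the client, resolves hostnames.
    Socks5h,
}

/// Splits a proxy value such as "socks5://127.0.0.1:1080" into its kind and
/// the remaining "host[:port]" part.
fn classify_proxy(value: &str) -> Result<(ProxyKind, &str), String> {
    // Assumption: a bare "host:port" is treated as an HTTP proxy, as in curl.
    let (scheme, rest) = value.split_once("://").unwrap_or(("http", value));

    let kind = match scheme.to_ascii_lowercase().as_str() {
        "http" => ProxyKind::Http,
        "https" => ProxyKind::Https,
        "socks5" => ProxyKind::Socks5,
        "socks5h" => ProxyKind::Socks5h,
        other => return Err(format!("unsupported proxy scheme: {other}")),
    };

    if rest.is_empty() {
        return Err("missing proxy host".to_string());
    }
    Ok((kind, rest))
}
```

With this shape, `lychee --proxy http://proxy.example` and a `socks5://` value would go through the same flag, and the scheme alone decides how the connection to the proxy is made.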