lycheeverse / lychee

⚡ Fast, async, stream-based link checker written in Rust. Finds broken URLs and mail addresses inside Markdown, HTML, reStructuredText, websites and more!
https://lychee.cli.rs
Apache License 2.0
2k stars 119 forks source link

ssl errors sometimes erroneously 'ok' #966

Open khemarato opened 1 year ago

khemarato commented 1 year ago

urls.txt

https://expired.badssl.com
https://wrong.host.badssl.com
https://self-signed.badssl.com
https://untrusted-root.badssl.com
https://revoked.badssl.com
https://pinning-test.badssl.com
https://no-common-name.badssl.com
https://no-subject.badssl.com
https://incomplete-chain.badssl.com
https://client-cert-missing.badssl.com
https://no-sct.badssl.com
https://captive-portal.badssl.com
https://mitm-software.badssl.com
https://invalid-expected-sct.badssl.com
https://www.palitext.com

Command: echo 'https://www.palitext.com/urldne' | lychee - || lychee urls.txt

Version: 0.11.1

Expected Result: All fail

Actual Result: pinning-test, no-sct, and final palitext.com homepage all pass

Note: fwiw, curl also erroneously accepts the no-sct and pinning-test URLs, but doesn't get confused by the 404 -> https error that seems to be tripping up lychee... 🤷

khemarato commented 1 year ago

It seems that if you just test https://www.palitext.com it fails as expected. But if you first test a 404 page and then test the homepage, it's erroneously ok 🤷

mre commented 1 year ago

LOL, that's very weird in indeed. Can anybody test this and confirm?

khemarato commented 1 year ago

@mre edited with what seems to be a more reliable reproduction

mre commented 1 year ago

I tested again with lychee 0.12.0 and here's what I get:

 echo 'https://www.palitext.com/urldne' | lychee - || lychee --verbose urls.txt                                                                                           
Issues found in 1 input. Find details below.

[stdin]:
✗ [404] https://www.palitext.com/urldne | Failed: Network error: Not Found

🔍 1 Total ✅ 0 OK 🚫 1 Error (HTTP:1)
✗ [ERR] https://no-subject.badssl.com/ | Failed: Network error: The certificate was not trusted.
✗ [ERR] https://no-sct.badssl.com/ | Failed: Network error: The certificate was not trusted.
✗ [ERR] https://incomplete-chain.badssl.com/ | Failed: Network error: The certificate was not trusted.
✗ [ERR] https://invalid-expected-sct.badssl.com/ | Failed: Network error: The certificate was not trusted.
✗ [ERR] https://no-common-name.badssl.com/ | Failed: Network error: The certificate was not trusted.
✗ [ERR] https://captive-portal.badssl.com/ | Failed: Network error: The certificate was not trusted.
✗ [ERR] https://expired.badssl.com/ | Failed: Network error: The certificate was not trusted.
✗ [ERR] https://self-signed.badssl.com/ | Failed: Network error: The certificate was not trusted.
✗ [ERR] https://revoked.badssl.com/ | Failed: Network error: The certificate was not trusted.
✗ [ERR] https://mitm-software.badssl.com/ | Failed: Network error: The certificate was not trusted.
✗ [ERR] https://untrusted-root.badssl.com/ | Failed: Network error: The certificate was not trusted.
✔ [200] https://www.palitext.com/
✗ [ERR] https://wrong.host.badssl.com/ | Failed: Network error: The certificate was not trusted.
✗ [400] https://client-cert-missing.badssl.com/ | Failed: Network error: Bad Request
✔ [200] https://pinning-test.badssl.com/

Issues found in 1 input. Find details below.

[urls.txt]:
...

🔍 15 Total ✅ 2 OK 🚫 13 Errors (HTTP:13)

So, looks like we're down to just two:

✔ [200] https://www.palitext.com/
✔ [200] https://pinning-test.badssl.com/

Opening https://www.palitext.com/ in a browser gives me a 200, so that looks correct to me. Was there a change?

I am not sure how to fix https://pinning-test.badssl.com/. Is that something that needs to be handled upstream? E.g. with boringssl or openssl or on the OS level?

khemarato commented 1 year ago

Opening https://www.palitext.com/ in a browser gives me a 200, so that looks correct to me. Was there a change

Yes, they put up a placeholder page since I opened this issue.

Seeing as the pinning test is also broken in curl feel free to close this issue (though you may want to add tests for the other URLs just to ensure there's no regression in the future).