lycheeverse / lychee

⚡ Fast, async, stream-based link checker written in Rust. Finds broken URLs and mail addresses inside Markdown, HTML, reStructuredText, websites and more!
https://lychee.cli.rs
Apache License 2.0

Issues with relative links #1296

Open mre opened 11 months ago

mre commented 11 months ago

A user trying out lychee for NLnet found that version 0.13 misreports many links, e.g.:

lychee https://nlnet.nl/thema/OpenDocumentFormat.html

Issues found in 1 input. Find details below.

[https://nlnet.nl/thema/OpenDocumentFormat.html]:
✗ [404] https://nlnet.nl/BinaryAnalysisFund.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/Services+Applications.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/InternetInfrastructure.html | Failed: Network error: Not Found
✗ [ERR] http://www.viewerjs.org/examples | Failed: Network error: The certificate was not trusted.
✗ [404] https://nlnet.nl/GetEduroam.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/Deployability.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/Decentralisedsolutions.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/Conferences.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/NetworkApplications.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/Applicationprotocols.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/EducationalPrograms.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/InternetHardeningFund.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/Reportsandstudies.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/NGIZeroCore.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/NGIZeroDiscovery.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/InformationRetrieval.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/Verticals+Search.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/User-operatedInternetFund.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/NGI0Entrust.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/Measurement.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/Middlewareandidentity.html | Failed: Network error: Not Found
✗ [500] http://translate.org.za/ | Failed: Network error: Internal Server Error
✗ [404] https://nlnet.nl/Networkinfrastructure.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/Privacyandsecurity.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/CommunityPrograms.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/OperatingSystems.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/NGIAssure.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/DataandAI.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/OpenDocumentFormat.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/Softwareengineering.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/Hardware.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/OpenData.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/NREN.html | Failed: Network error: Not Found
✗ [404] https://nlnet.nl/NGIZeroPET.html | Failed: Network error: Not Found

🔍 113 Total (in 11s) ✅ 78 OK 🚫 34 Errors 💤 1 Excluded

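All of the 404 errors come from relative links. (The two remaining errors, for viewerjs.org and translate.org.za, are absolute links and unrelated.)

lychee incorrectly strips the directory part of the base path when resolving them. E.g. instead of https://nlnet.nl/thema/BinaryAnalysisFund.html it checks https://nlnet.nl/BinaryAnalysisFund.html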
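
For reference, here is a minimal sketch of the resolution one would expect under RFC 3986. It uses Python's standard `urljoin` purely as an illustration and is not lychee's code. The `href` value is an assumption about how the link appears in the page source; it has not been checked against the actual HTML.

```python
from urllib.parse import urljoin

# Page that was checked (taken from the report above).
base = "https://nlnet.nl/thema/OpenDocumentFormat.html"

# ASSUMPTION: the page links to its sibling with a bare relative href.
href = "BinaryAnalysisFund.html"

# A relative reference replaces only the last path segment of the base,
# so the /thema/ directory is kept.
expected = urljoin(base, href)

# Resolving as if the href were root-relative reproduces the URL that
# shows up in the error report.
reported = urljoin(base, "/" + href)

print(expected)  # https://nlnet.nl/thema/BinaryAnalysisFund.html
print(reported)  # https://nlnet.nl/BinaryAnalysisFund.html
```

If the hrefs really are bare relative references, the second URL is what you get when the base is reduced to its origin before joining, which would match the symptom described here.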

fingolfin commented 7 months ago

Same issue here. I was excited about lychee at first, but this makes it unusable for me. Unfortunately I don't know any Rust, so I can't really offer help either :-/.