postlight / parser

📜 Extract meaningful content from the chaos of a web page
https://reader.postlight.com
Apache License 2.0

Postlight Reader | URLs broken #738

Open TheOnlyWayUp opened 1 year ago

TheOnlyWayUp commented 1 year ago

Expected Behavior

After I use the Postlight Reader Extension, I expect all links in the cleaned article to be functional.

Current Behavior

In the cleaned document (presented by the Postlight Reader extension), links go from /path (a relative URL) to chrome-extension://oknpjjbmpnndlpmnhmekjpocelpnlfdi/path.

Steps to Reproduce

Possible Solution

While the article is being cleaned, retrieve the domain name and prefix all relative URLs (href="/path") with the domain (href="domain.com/path").
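
A minimal sketch of this idea in TypeScript, assuming the original page URL is still available while the article is being cleaned. The names `resolveHref` and `rewriteLinks` are hypothetical and are not part of the Postlight Parser API. The regex-based rewrite is a simplification for illustration and only handles double-quoted href attributes; a real fix would more likely walk the parsed DOM. Resolution relies on the standard `URL` constructor, which resolves a relative reference against a base URL and produces a full URL including the scheme.

```typescript
// Hypothetical helper: resolve one href against the URL of the original page.
// Absolute URLs pass through unchanged; unparsable values are returned as-is.
function resolveHref(href: string, pageUrl: string): string {
  try {
    return new URL(href, pageUrl).href;
  } catch {
    return href;
  }
}

// Hypothetical helper: rewrite every double-quoted href="..." in an HTML string.
// Simplified on purpose; it does not handle single-quoted or unquoted attributes.
function rewriteLinks(html: string, pageUrl: string): string {
  return html.replace(
    /href="([^"]*)"/g,
    (_match: string, href: string) => `href="${resolveHref(href, pageUrl)}"`
  );
}

// Example page URL (an assumption for illustration only).
const pageUrl = "https://example.com/articles/post";
const cleaned = rewriteLinks('<a href="/path">link</a>', pageUrl);
console.log(cleaned); // <a href="https://example.com/path">link</a>
```

Resolving against the full page URL rather than only the domain also covers links such as other.html or #section, which are relative to the current document instead of the site root.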


Cool stuff, thank you :)

Rilomilo commented 1 year ago

Yes, I have the same issue. Relative URLs break in many other contexts too; absolute URLs are needed.