adobe / helix-importer

Foundation tools for importing website content that can be consumed in a Helix project.
Apache License 2.0

Image URL adjustment ignores absolute URLs to the same host #295

Closed kptdobe closed 9 months ago

kptdobe commented 9 months ago

adjustImageUrls makes sure that image URLs point to the same host as the current URL being processed. For now, it only handles relative URLs (and makes them absolute). But there is another case to consider: absolute image URLs that refer to the host of the page being imported, when the import itself is running on localhost (sic!).

Example: an import is running and the page https://www.sample.com/page.html is being imported, but in reality http://localhost:3001/page.html is being processed. During the import, the "current" host is http://localhost:3001, so if an image originally used an absolute URL, it does not get rewritten and the import might fail to download the image (CSP...).

We therefore need to make sure those image URLs are adjusted too.
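
To illustrate the intended behavior, here is a minimal sketch of the rewrite rule. It is not the actual adjustImageUrls code: `rewriteImageSrc` and its parameters `originalUrl` and `currentUrl` are hypothetical names, and the sketch assumes the local server serves the same paths as the original host. It relies only on the standard WHATWG `URL` class.

```javascript
// Hypothetical helper (not the helix-importer implementation).
// src:         value of the img "src" attribute
// originalUrl: URL of the page being imported, e.g. https://www.sample.com/page.html
// currentUrl:  URL actually being processed, e.g. http://localhost:3001/page.html
function rewriteImageSrc(src, originalUrl, currentUrl) {
  const original = new URL(originalUrl);
  const current = new URL(currentUrl);

  // Resolve relative srcs against the original page URL;
  // absolute srcs are parsed as-is.
  const resolved = new URL(src, original);

  // Relative URLs and absolute URLs pointing to the original host
  // are both moved to the host that is really serving the import.
  if (resolved.host === original.host) {
    return `${current.origin}${resolved.pathname}${resolved.search}${resolved.hash}`;
  }

  // Images on any other host are left untouched.
  return resolved.href;
}
```

With the URLs from the example above, both `/media/a.png` and `https://www.sample.com/media/a.png` would end up on `http://localhost:3001`, while an image hosted on a third-party domain keeps its URL.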