rockdaboot / wget2

The successor of GNU Wget. Contributions preferred at https://gitlab.com/gnuwget/wget2. But accepted here as well 😍
GNU Lesser General Public License v3.0

Redirect treated as parent ascend and not followed with --no-parent #331

Closed: JurajMarcin closed this issue 4 weeks ago

JurajMarcin commented 2 months ago

When I try to download all images in a directory from a Fedora mirror, the mirror I get redirected to might not share its host and path with the original URL. The redirect target is therefore treated as a parent directory and skipped.

./src/wget2_noinstall \
    --recursive \
    --no-directories \
    --no-parent \
    --span-hosts \
    'https://download.fedoraproject.org/pub/fedora/linux/development/rawhide/Cloud/x86_64/images'
[0] Downloading 'https://download.fedoraproject.org/robots.txt' ...
HTTP response 302  [https://download.fedoraproject.org/robots.txt]
Adding URL: https://ftp.fi.muni.cz/pub/linux/centos-stream/robots.txt
URL 'https://ftp.fi.muni.cz/pub/linux/centos-stream/robots.txt' not followed (parent ascending not allowed)
[0] Downloading 'https://download.fedoraproject.org/pub/fedora/linux/development/rawhide/Cloud/x86_64/images' ...
HTTP response 302  [https://download.fedoraproject.org/pub/fedora/linux/development/rawhide/Cloud/x86_64/images]
Adding URL: https://mirror.karneval.cz/pub/linux/fedora/linux/development/rawhide/Cloud/x86_64/images/
URL 'https://mirror.karneval.cz/pub/linux/fedora/linux/development/rawhide/Cloud/x86_64/images/' not followed (parent ascending not allowed)

I would expect the redirected URL to be treated the same as a URL given directly on the command line, and the files it contains to be downloaded.
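
To make the expectation concrete, here is a minimal sketch of the behavior in Python. It is not wget2's actual code: the class `NoParentFilter` and its methods are hypothetical names used only for illustration, and the sketch assumes that the "parent" check compares the host and the directory prefix of each URL against the starting URL. The idea is that when an allowed URL is redirected, the directory of the redirect target is accepted as an additional base.

```python
from urllib.parse import urlsplit


def directory_of(url):
    """Return (host, directory) for url; the directory always ends with '/'."""
    parts = urlsplit(url)
    path = parts.path
    # Keep everything up to and including the last '/', or '/' if there is none.
    directory = path[: path.rfind("/") + 1] or "/"
    return parts.hostname, directory


class NoParentFilter:
    """Hypothetical model of a --no-parent check that follows redirects."""

    def __init__(self, start_url):
        # The directory of the command-line URL is the initial base.
        self.bases = [directory_of(start_url)]

    def allowed(self, url):
        parts = urlsplit(url)
        path = parts.path or "/"
        return any(
            parts.hostname == host and path.startswith(directory)
            for host, directory in self.bases
        )

    def on_redirect(self, from_url, to_url):
        # Only a redirect coming from an allowed URL may add a new base,
        # so arbitrary redirects cannot widen the crawl.
        if self.allowed(from_url):
            base = directory_of(to_url)
            if base not in self.bases:
                self.bases.append(base)


if __name__ == "__main__":
    start = ("https://download.fedoraproject.org/pub/fedora/linux/"
             "development/rawhide/Cloud/x86_64/images")
    target = ("https://mirror.karneval.cz/pub/linux/fedora/linux/"
              "development/rawhide/Cloud/x86_64/images/")
    f = NoParentFilter(start)
    print(f.allowed(target))   # before the redirect is registered
    f.on_redirect(start, target)
    print(f.allowed(target))   # after the redirect is registered
```

With this model, the mirror URL from the log above is rejected until the redirect is registered and accepted afterwards, while the mirror's `x86_64/` directory (the parent of `images/`) stays excluded.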

rockdaboot commented 2 months ago

Makes sense, thanks for the short reproducer.