Debian / wcurl

a simple wrapper around curl to easily download files - MIRROR of https://salsa.debian.org/debian/wcurl
https://samueloph.dev/blog/announcing-wcurl-a-curl-wrapper-to-download-files/

Filenames are not percent-decoded #10

Open · ryandesign opened this issue 4 days ago

ryandesign commented 4 days ago

wget 1.24.5 percent-decodes filenames:

% wget https://packages.macports.org/itstool/itstool-2.0.7_2%2Bpython312.any_any.noarch.tbz2
--2024-07-04 13:35:28--  https://packages.macports.org/itstool/itstool-2.0.7_2%2Bpython312.any_any.noarch.tbz2
Resolving packages.macports.org (packages.macports.org)... 146.75.106.132
Connecting to packages.macports.org (packages.macports.org)|146.75.106.132|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 19316 (19K) [application/x-bzip2]
Saving to: ‘itstool-2.0.7_2+python312.any_any.noarch.tbz2’

itstool-2.0.7_2+python312.any_any.noarch.tbz2                       100%[==================================================================================================================================================================>]  18.86K  --.-KB/s    in 0.001s

2024-07-04 13:35:28 (33.7 MB/s) - ‘itstool-2.0.7_2+python312.any_any.noarch.tbz2’ saved [19316/19316]

% ls -l itstool*
-rw-r--r--  1 rschmidt  wheel  19316 Jan 19 09:52 itstool-2.0.7_2+python312.any_any.noarch.tbz2
% rm itstool*

wcurl 2024-07-02 doesn't:

% wcurl https://packages.macports.org/itstool/itstool-2.0.7_2%2Bpython312.any_any.noarch.tbz2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 19316  100 19316    0     0   287k      0 --:--:-- --:--:-- --:--:--  285k
% ls -l itstool*
-rw-r--r--  1 rschmidt  wheel  19316 Jan 19 09:52 itstool-2.0.7_2%2Bpython312.any_any.noarch.tbz2
% rm itstool*
ryandesign commented 4 days ago

I am working on fixing this (and #4) by using trurl to parse the URL (falling back to the name index.html if the path is empty after stripping everything up to the last slash) and then having wcurl specify the filename with --output instead of letting curl pick it with --remote-name. However, I'm having to rewrite the way the curl arguments are collected in exec_curl, because the current approach of putting everything into a single string does not accommodate quoting of special characters. I'd normally use a bash array, but if you're trying to maintain POSIX sh compatibility I'll need to be more inventive. POSIX sh only has one array, $@, which is already in use in exec_curl to hold the URLs, so I'll either try to dual-purpose $@ or find a different way to store the URLs.
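
As a rough illustration of that approach (a hypothetical sketch, not the code in the merge request): the filename could be derived along these lines, assuming the installed trurl accepts the URL-decoded {:path} variant of --get (older releases may only offer the still-encoded {path} form); filename_from_url is an invented helper name.

#!/bin/sh
# Hypothetical sketch: derive a percent-decoded local filename from a URL.
# Assumption: trurl supports the URL-decoded "{:path}" getter.
filename_from_url() {
    path=$(trurl --url "$1" --get '{:path}') || return 1
    name=${path##*/}                     # strip everything up to the last slash
    [ -n "$name" ] || name=index.html    # fall back when the path is empty
    printf '%s\n' "$name"
}

filename_from_url 'https://packages.macports.org/itstool/itstool-2.0.7_2%2Bpython312.any_any.noarch.tbz2'
# prints itstool-2.0.7_2+python312.any_any.noarch.tbz2 (if trurl decodes as assumed)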

samueloph commented 4 days ago

@ryandesign maybe we should reconsider doing it in bash... I'll speak to sergiodj about it.

ryandesign commented 4 days ago

I don't quite see why $URLS is being moved into $@, so I'm attempting to use $URLS directly in the loop and leave $@ free for the curl arguments. I'm not opposed to the POSIX sh challenge.
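
A minimal sketch of that split, again not wcurl's actual exec_curl: it assumes $URLS holds whitespace-separated URLs and rebuilds the positional parameters for each one, reusing the hypothetical filename_from_url helper above. --fail, --location and --output are standard curl options; the surrounding structure is invented for the example.

# Sketch only: keep the URL list in $URLS, use "$@" for per-URL curl arguments.
exec_curl() {
    for url in $URLS; do
        # Rebuild the argument list for this URL; special characters survive
        # because each argument stays a separate positional parameter.
        set -- --fail --location --output "$(filename_from_url "$url")" "$url"
        curl "$@" || return 1
    done
}

URLS='https://packages.macports.org/itstool/itstool-2.0.7_2%2Bpython312.any_any.noarch.tbz2'
exec_curl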

ryandesign commented 4 days ago

I think I have it working and will submit it in a few hours.

ryandesign commented 4 days ago

This works for me but please test:

https://salsa.debian.org/debian/wcurl/-/merge_requests/4

Note there is a new dependency on trurl.