mirror / wget

Wget Git mirror
GNU General Public License v3.0

Paths are truncated too early when `pathconf` is available #26

Open Flamefire opened 5 months ago

Flamefire commented 5 months ago

I'm running a wget --mirror operation on a fairly long URL (about 242 chars including the host folder) into another folder via --outdir, which adds another couple of chars.

This runs into the length limitation at https://github.com/mirror/wget/blob/9a35fe609c87c558153cff80fef7dea809b3cf63/src/url.c#L1523

However, I think that limitation is too aggressive: it takes the entire quoted path (i.e. at least the 242 chars) and compares it against pathconf(..., _PC_NAME_MAX) when that is available. See https://github.com/mirror/wget/blob/9a35fe609c87c558153cff80fef7dea809b3cf63/src/utils.c#L2665-L2671

Note that _PC_NAME_MAX gives the maximum length of a single filename, while PATH_MAX is the maximum length of a whole path. The former is 255 and the latter is 4096 on "usual" Linux systems. So the two code paths are far from equivalent!
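
To see the difference on a given system, both limits can be queried with pathconf. The following is only a minimal sketch, not code from wget; the helper names max_name_len and max_path_len are made up for illustration.

```c
#include <unistd.h>

/* Hypothetical helper (not from wget): longest single file name allowed
   in `dir`. Returns -1 if the limit is indeterminate or the call fails. */
long max_name_len(const char *dir)
{
    return pathconf(dir, _PC_NAME_MAX);
}

/* Hypothetical helper (not from wget): longest relative path name allowed
   when `dir` is the working directory. Returns -1 if the limit is
   indeterminate or the call fails. */
long max_path_len(const char *dir)
{
    return pathconf(dir, _PC_PATH_MAX);
}
```

Calling both helpers on the same directory and printing the results with "%ld" should show the gap; on a typical Linux file system the values are expected to be around 255 and 4096, but that depends on the file system.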

Given that the "chomp buffer" size (19) is additionally subtracted, any output path is truncated at 255-19=236 chars, which isn't enough for use cases such as mine: mirroring a larger hierarchy of folders (files nested 10 folders deep).
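
To illustrate the difference between checking the whole path and checking each path component, here is a rough sketch. It is not wget's code: the function names, the reserve parameter (standing in for the 19-char "chomp buffer"), and the choice to apply the reserve to every component are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the longest '/'-separated component in `path`. */
size_t longest_component(const char *path)
{
    assert(path != NULL);
    size_t longest = 0;
    while (*path != '\0') {
        size_t len = strcspn(path, "/");  /* chars up to next '/' or end */
        if (len > longest)
            longest = len;
        path += len;
        if (*path == '/')
            path++;                       /* skip the separator */
    }
    return longest;
}

/* Whole-path check, as described in this issue: the total length of the
   path is compared against the file name limit minus the reserve. */
int whole_path_fits(const char *path, size_t name_max, size_t reserve)
{
    return name_max > reserve && strlen(path) <= name_max - reserve;
}

/* Per-component check (assumed alternative): only the longest single
   component is compared against the same budget. */
int components_fit(const char *path, size_t name_max, size_t reserve)
{
    return name_max > reserve && longest_component(path) <= name_max - reserve;
}
```

With name_max = 255 and reserve = 19, a path made of ten 24-char folder names (249 chars in total) fails the whole-path check because 249 > 236, but passes the per-component check because no single component exceeds 236 chars.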