Closed rgaudin closed 5 months ago
Fix 1 if possible in good conditions, not because this is wrong, but because this can be a source of trouble without bringing clear advantage.
AND
Fix 2 because Kiwix should work properly anyway and macOS15 will be released within a few month (so we can just bump the version of Swift... but maybe better in 3.5.0.
Fix 1 is not easy in good conditions in warc2zim. It would means removing the trailing /
in all entries which have one (quite possible) but it means we have a risk of conflict, probably rare but mostly impossible to solve. It is perfectly legit for a website to have two different resources at two different URLs differing only by a trailing /
(e.g. www.example.com/myresource
and www.example.com/myresource/
).
Since ZIM entries are just a "string", mostly mapped to a URL, and URLs ending with a slash are legit, I don't get why we would impose more constraints that necessary on ZIM entries.
Plus removing a trailing slash in the path of a ZIM entry corresponding to an HTML document means that all relative URLs in the document have to be adapted because we are not anymore in a "folder" but an a "file".
@BPerlakiH The we have the point 2 left!
Why not point 3 for 3.4, and point 2 for 3.5? (I mean, the code of point 3 seems a bit custom but rather plain simple to write)
Why not point 3 for 3.4, and point 2 for 3.5? (I mean, the code of point 3 seems a bit custom but rather plain simple to write)
š
Why not point 3 for 3.4, and point 2 for 3.5? (I mean, the code of point 3 seems a bit custom but rather plain simple to write)
š
Sounds a bit like a hack to me, which will be later rolled back. this is why I prefer #2 straight.
There is nothing in the ZIM spec that prevents using entries ending in
/
and thus we ended up in #779 with a ZIM (quartierschinois) that has a/
-ending main page.Nothing prevents URLs to end in slash neither.
One popular convention though is that the path is composed of segments separated by
/
and the last part is the document. As such, a/
-ending path can be considered equivalent to the same path without that trailing slash. This convention is still implemented in some web servers, browsers, and libraries.A swift
URL
has apath
property that we use all over the code base to extract the ZIM entry path.In the swift version we use,
URL.path
removes that trailing slash, breaking lookup of ZIM entries with/
-ending path.In later version of swift (macOS 13+), Apple added an
URL.path()
that keeps the trailing slash and works as intended with those entries.What should we do?
/
-ending ZIM entries incorrect and fix ZIM files to not use them?URL.path()
and bump our support from macOS12+ to macOS13+ and iOS15+ to iOS16+ ?URL.hasDirectoryPath
tells us if URL ends in/
) ?I'm in favor of the latter, until we drop those versions support at the end of the year.