evolvingweb / sitediff

SiteDiff makes it easy to see differences between two versions of a website.
http://sitediff.io
GNU General Public License v2.0

Paths with trailing slashes always have the trailing slash removed #154

Open Mark-Hetherington opened 1 year ago

Mark-Hetherington commented 1 year ago

We are working with a site that has a number of index pages, organised by category. It uses URLs of the form /content//, where the trailing slash distinguishes a category filter from other filters. As a result, removing the trailing slash produces a URL that does not exist on the site.

The trailing slash is removed in uriwrapper.rb:194.

I understand the desire to canonicalise the URLs; however, it would be useful if this behaviour could be made optional for sites that need it.
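
To illustrate what an opt-out could look like, here is a minimal sketch. It is not SiteDiff's actual code: the helper name `normalize_path` and the `strip_trailing_slash` option are hypothetical, chosen only to show the idea of making the canonicalisation conditional.

```ruby
# Hypothetical helper (not part of SiteDiff): normalise a URL path,
# removing the trailing slash only when the caller asks for it.
def normalize_path(path, strip_trailing_slash: true)
  # Sites that rely on the trailing slash can opt out and keep the path as-is.
  return path unless strip_trailing_slash

  # Leave the root path "/" alone; otherwise drop one trailing slash.
  path.length > 1 ? path.chomp('/') : path
end

puts normalize_path('/content/news/')
puts normalize_path('/content/news/', strip_trailing_slash: false)
```

With the default, the slash is stripped as it is today; passing `strip_trailing_slash: false` leaves the path untouched, which is the behaviour a site like ours would need.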

kirk-brown-ew commented 1 year ago

SiteDiff currently doesn't support this type of URL.

SiteDiff creates directories and files based on the paths it sees. When it sees /content/ and /content//, it maps both to the same file.
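
The collision can be sketched as follows. This is an illustration under assumptions, not SiteDiff's real storage code: the `cache` directory, both function names, and the `index.html` convention are hypothetical. The first function shows how splitting a path into segments loses the trailing slash; the second shows one possible way to keep the two forms apart.

```ruby
# Hypothetical: map a URL path to a file by splitting on "/" and dropping
# empty segments. A trailing slash leaves no trace in the result.
def naive_cache_file(path)
  segments = path.split('/').reject(&:empty?)
  File.join('cache', *segments) + '.html'
end

# Hypothetical alternative: store a path that ends in "/" as an index file
# inside a directory, so both forms get distinct files.
def slash_aware_cache_file(path)
  segments = path.split('/').reject(&:empty?)
  if path.end_with?('/')
    File.join('cache', *segments, 'index.html')
  else
    File.join('cache', *segments) + '.html'
  end
end

puts naive_cache_file('/content/news')
puts naive_cache_file('/content/news/')
puts slash_aware_cache_file('/content/news')
puts slash_aware_cache_file('/content/news/')
```

The second scheme is only one option and has its own edge case: a page whose last path segment is literally `index` would land on the same file as its parent directory's slash form, so a real fix would need a name that cannot occur in a URL path.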

We are investigating solutions for this issue.