Open wtf opened 8 years ago
This could be valuable but we have to rethink how the blocklist is handled.
Right now we have flat lists that do not describe what kind of check has to be done on each hash.
As a result, we have to apply all the checks to every hash, and what you are proposing here, combined with the current implementation, would require computing O(n) hashes, with n the length of the URL.
A possible solution to this is described here: https://github.com/globaleaks/Tor2web/issues/280#issuecomment-174238795
Since we already split the incoming URL into parent URLs and compare those with our block list, removing trailing slashes would double the amount of work, but it would still be much better than O(n).
Currently, blocklisting blahblahblahblah.onion/a/ blocks blahblahblahblah.onion/a/b/c/ but not blahblahblahblah.onion/a/b/c. At least some websites default to URLs without trailing slashes and may also redirect to them from URLs with trailing slashes.
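To make the proposal concrete, here is a minimal sketch of a parent-URL check that tries each prefix both with and without a trailing slash. It is not the Tor2web implementation: the function names `parent_urls`, `url_hash` and `is_blocked` are hypothetical, SHA-256 stands in for whatever hash the blocklist actually uses, and the blocklist is assumed to be a plain set of hex digests. Query strings are not handled.

```python
import hashlib


def parent_urls(url):
    """Return the URL and all its parent URLs, each with and without a trailing slash."""
    host, _, path = url.partition("/")
    # Drop empty segments so "a/b/c" and "a/b/c/" yield the same candidates.
    segments = [s for s in path.split("/") if s]
    prefix = host
    candidates = [prefix, prefix + "/"]
    for segment in segments:
        prefix = prefix + "/" + segment
        candidates += [prefix, prefix + "/"]
    return candidates


def url_hash(url):
    # Assumption: SHA-256 is a placeholder for the blocklist's real hash function.
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def is_blocked(url, blocklist):
    """True if the hash of the URL or of any parent URL is in the blocklist."""
    return any(url_hash(candidate) in blocklist for candidate in parent_urls(url))


blocklist = {url_hash("blahblahblahblah.onion/a/")}
print(is_blocked("blahblahblahblah.onion/a/b/c/", blocklist))  # True
print(is_blocked("blahblahblahblah.onion/a/b/c", blocklist))   # True
print(is_blocked("blahblahblahblah.onion/x", blocklist))       # False
```

With this approach the number of hashes per request is twice the number of path levels (8 for blahblahblahblah.onion/a/b/c, counting the bare host), not proportional to the length of the URL in characters.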