cryogen-project / cryogen-core

Cryogen's core
Eclipse Public License 1.0
69 stars 62 forks source link

Fix 142: URL encode tag links #143

Closed gamlerhart closed 3 years ago

gamlerhart commented 4 years ago

Tags which contain special characters like a # or ? produce a link which doesn't reach the tag output page. Fix: URL encode the tag in the links

gamlerhart commented 4 years ago
gamlerhart commented 4 years ago

Fixed the space encoding, so that a space in a tag is encoded as %20.

holyjak commented 3 years ago

~Mostly looks good to my but the tag test #2 does not work. The URL becomes test followed by anchor reference #2/. I guess # would need to be encoded separately. But your PR already is an improvement over the current situation so I am all for merging it.~

~But I would prefer if you removed the c# test b/c it is actually wrong and will produce a link that does not work due to the special status of #.~

Looks good to me. (Ignore the comment above, I was testing the wrong code 😅)

bombaywalla commented 3 years ago

Out of curiosity, why use %20 for the space instead of the + that is used by the standard Java URLEncode function?

bombaywalla commented 3 years ago

I wonder whether ALL URLs should be encoded? Not just tags URLs.

holyjak commented 3 years ago

@gamlerhart do you have opportunity to answer 👆?

@bombaywalla What other URLs do you mean? The author does control the url through the file name so I would trust them to pick names that work in the URL. Are there any others then these and the tags?

gamlerhart commented 3 years ago

Why using the %20 instead of '+'? Because some webservers pass through the '+' to the file system. While %20 is explicitly a encoded space.

About encoding all URLs: I didn't encounter any issue. Most urls are created from the directory / file names. And is use conservative names, so I never encountered a issue. I do not want overshoot an 'fix' something which I don't know if there is a issue =)

bombaywalla commented 3 years ago

Thanks for the explanation about not using '+'. Though, I'd say that the web servers that pass along the '+' are not following the standard. But that is the reality we have to deal with ....

As to encoding all URLS, I was just considering that for consistency. For example, I can easily see people creating files whose names contain a space. But perhaps we cross this bridge when we get to it.

Thanks for writing tests for your PR. Much appreciated.

I have two minor requests:

  1. There is no newline at the end of test/cryogen_core/compiler_test.clj. Would you please add one.
  2. Would you please add a (testing ...) to your deftest that describes the tag URL encoding you are testing.

I am okay with merging this PR once the above changes are made.

holyjak commented 3 years ago

I agree with Dorab here, it would be helpful to document in the code the need for the extra space encoding so that when we revisit it in a few years, we won't wonder why it is there... 🙏

gamlerhart commented 3 years ago

I've amended the commit.