gohugoio / hugo

The world’s fastest framework for building websites.
https://gohugo.io
Apache License 2.0

Terms (Tags/Categories) output incorrect/broken directory names #4519

Closed crgeary closed 6 years ago

crgeary commented 6 years ago

I hit a problem where one of my tags contains the hash # character, and Hugo is generating a directory with the # in it (which of course would not work on the web).

Tag: "#hi"
Folder created: /tags/#hi/


I then ran into another problem. I created a tag called no (unquoted), and this seems to create a folder called false which I am really confused about.

Tag: no
Folder created: /tags/false/


Hugo version: v0.37.1
Mac OS: v10.11.6

kaushalmodi commented 6 years ago

You cannot have "#" in tags, so rename them. You also need to double-quote the "no": per the YAML spec, an unquoted no is read as a boolean, which is why you end up with false. Further discussion belongs in https://discourse.gohugo.io/, as the issues you raised have already been addressed there in older posts (as recent as last week!).
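To illustrate the quoting point, here is a minimal Go sketch of how YAML 1.1 boolean resolution turns an unquoted no into false. It does not call a real YAML parser; `yaml11Bools` and `termDir` are hypothetical names made up for this example, and an actual parser may recognize a different subset of these words.

```go
package main

import "fmt"

// yaml11Bools holds the plain (unquoted) scalars that the YAML 1.1
// boolean type resolves to true or false. Hypothetical table for
// illustration; a real parser may accept only part of this set.
var yaml11Bools = map[string]bool{
	"y": true, "Y": true, "yes": true, "Yes": true, "YES": true,
	"true": true, "True": true, "TRUE": true,
	"on": true, "On": true, "ON": true,
	"n": false, "N": false, "no": false, "No": false, "NO": false,
	"false": false, "False": false, "FALSE": false,
	"off": false, "Off": false, "OFF": false,
}

// termDir is a hypothetical helper: it returns the directory name a
// term would get once the front matter value has been resolved.
// Quoted scalars always stay strings; unquoted ones may become booleans.
func termDir(scalar string, quoted bool) string {
	if !quoted {
		if b, ok := yaml11Bools[scalar]; ok {
			return fmt.Sprint(b) // the boolean is stringified later
		}
	}
	return scalar
}

func main() {
	fmt.Println("/tags/" + termDir("no", false) + "/") // /tags/false/
	fmt.Println("/tags/" + termDir("no", true) + "/")  // /tags/no/
}
```

Writing the tag as "no" in the front matter corresponds to the `quoted == true` branch above.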

paulcmal commented 6 years ago

I found this discussion from a while back, but nothing recent. Care to point us in the right direction @kaushalmodi?

Also, if I understand the code correctly, spaces are replaced by dashes, then UnicodeSanitize is called to remove unwanted characters. However, # is not an unwanted character in URLs in general, only in the generated slugs. So does that mean we need to strip the # beforehand?
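A rough sketch of the two steps described above (spaces to dashes, then dropping unwanted characters) might look like this in Go. This is not Hugo's actual UnicodeSanitize; `sanitizeTerm` and its `allowed` parameter are hypothetical and only show how the allow-list decides whether # survives.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// sanitizeTerm is a hypothetical helper, not Hugo's implementation.
// It replaces spaces with dashes, then keeps only letters, digits,
// combining marks, and the punctuation listed in allowed.
func sanitizeTerm(term, allowed string) string {
	dashed := strings.ReplaceAll(term, " ", "-")
	var b strings.Builder
	for _, r := range dashed {
		if unicode.IsLetter(r) || unicode.IsDigit(r) || unicode.IsMark(r) ||
			strings.ContainsRune(allowed, r) {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	// With '#' in the allow-list, the character ends up in the path.
	fmt.Println(sanitizeTerm("#hi", "-_.#")) // #hi
	// Without it, the '#' is stripped from the generated slug.
	fmt.Println(sanitizeTerm("#hi", "-_.")) // hi
	fmt.Println(sanitizeTerm("my tag", "-_.")) // my-tag
}
```

Under these assumptions, stripping # from slugs is just a matter of leaving it out of the allow-list, so no separate pass before sanitizing would be needed.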

I found this issue, which inspired me to do a little messing around. It appears # is perfectly handled in titles now, but not in URLs generated from filenames (foo#bar.md or foo#bar/index.md). The files in public are put in the foo#bar folder as expected, but these are unreachable because such a name is against web conventions, # being reserved for fragment identifiers.

Should this be addressed (i.e. # stripped from generated URLs) or documented as unsupported?

kaushalmodi commented 6 years ago

but nothing recent. Care to point us in the right direction @kaushalmodi?

It was probably this.

these are unreachable because such a name is against web conventions, # being reserved for fragment identifiers.

Exactly! Pound signs in tag names will end up in the URL and clash with the fragment identifier. I haven't tried, but the browser will probably pick the first # it finds as the start of the fragment identifier.
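The clash can be checked with Go's standard net/url package, which splits a URL at the first #. A small sketch follows; the example.org host and the helper names `splitURL` and `escapedTermPath` are made up for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// splitURL parses raw and returns the path and fragment parts.
func splitURL(raw string) (path, fragment string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	return u.Path, u.Fragment, nil
}

// escapedTermPath builds a /tags/ path with the term percent-encoded,
// so that a '#' in the term is not read as a fragment delimiter.
func escapedTermPath(term string) string {
	return "/tags/" + url.PathEscape(term) + "/"
}

func main() {
	path, frag, err := splitURL("https://example.org/tags/#hi/")
	if err != nil {
		panic(err)
	}
	// The request would go to /tags/; "hi/" never reaches the server.
	fmt.Printf("path=%q fragment=%q\n", path, frag) // path="/tags/" fragment="hi/"

	// Percent-encoding the '#' keeps it inside the path.
	fmt.Println(escapedTermPath("#hi")) // /tags/%23hi/
}
```

So a link to /tags/#hi/ resolves to the tag list page, and the #hi directory is only reachable if the link encodes the # as %23.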

shaform commented 6 years ago

This issue might be fixed by #4388 if it gets merged.

crgeary commented 6 years ago

@shaform unfortunately not. As of v0.44, this problem still exists.

shaform commented 6 years ago

@crgeary Oops, I should have said "this issue might be fixed if the PR gets merged in the future".

stale[bot] commented 6 years ago

This issue has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help. If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open. If this is a feature request, and you feel that it is still relevant and valuable, please tell us why. This issue will automatically be closed in the near future if no further activity occurs. Thank you for all your contributions.

shaform commented 6 years ago

@crgeary Now that #4388 is merged, I guess the # issue is fixed? #hi should now output hi. So perhaps this issue could be closed.

github-actions[bot] commented 2 years ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.