scragg0x / realms-wiki

Git based wiki inspired by Gollum
http://realms.io
GNU General Public License v2.0

File name extension #19

Open bdillahu opened 9 years ago

bdillahu commented 9 years ago

This may be related to the Relative Links issue #18 - Now, when I have a link like:

[test](test.md)

And click on it, it creates a file named

testmd.md

Shouldn't strip the dot in my opinion.

Thanks, Bruce

scragg0x commented 9 years ago
import re

def to_canonical(s):
    """
    Run of whitespace -> single dash
    Run of dashes -> single dash
    Remove everything except alphanumerics and dashes
    Limit to first 64 chars
    """
    # Drop non-ASCII characters, then decode back to text so this also works on Python 3
    s = s.encode('ascii', 'ignore').decode('ascii')
    s = re.sub(r"\s\s*", "-", s)
    s = re.sub(r"\-\-+", "-", s)
    s = re.sub(r"[^a-zA-Z0-9\-]", "", s)
    s = s[:64]
    return s

This is by design. The app appends .md before staging the file. The above function converts the name, stripping everything but alphanumerics and dashes. The idea was to create something that is safe for the file system and the URL while having them match exactly, except for the file extension. That is the reason I have the /_edit and /_create endpoints: underscores aren't allowed in wiki page names, so an endpoint can't easily be confused with a page. If I allowed "." for example, a user could create favicon.ico or robots.txt endpoints. People tried to do that on realms.io immediately. :) I could blacklist those, but there could be something else I'm forgetting. I can allow more characters later without breaking existing wikis. Removing characters that were once allowed is another story. So I started out safe.
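To make the behaviour concrete, here is a minimal sketch of what happens to a few link targets. It assumes a Python 3 port of the function above, and `staged_filename` is a hypothetical helper that stands in for the step where the app appends .md; it is not taken from the realms code base.

```python
import re

def to_canonical(s):
    # Python 3 port of the function quoted above (the port is an assumption)
    s = s.encode("ascii", "ignore").decode("ascii")
    s = re.sub(r"\s\s*", "-", s)          # whitespace run -> single dash
    s = re.sub(r"\-\-+", "-", s)          # dash run -> single dash
    s = re.sub(r"[^a-zA-Z0-9\-]", "", s)  # drop everything else, including "." and "_"
    return s[:64]

def staged_filename(link_target, ext=".md"):
    # Hypothetical helper: canonicalize the name, then append the extension
    return to_canonical(link_target) + ext

print(staged_filename("test.md"))   # the case reported in this issue
print(to_canonical("robots.txt"))   # dot is stripped, so no robots.txt endpoint
print(to_canonical("_edit"))        # underscore is stripped, so no clash with /_edit
```

Because the dot and the underscore never survive canonicalization, a page name cannot collide with a reserved path, which is the trade-off described above.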

scragg0x commented 9 years ago

BTW, I believe I based this function on Gollum, but that was a long time ago. I will take another look to see if they altered it.

bdillahu commented 9 years ago

Yeah, see your issue with security... my use case was strictly private, so wasn't worrying about it.

loleg commented 9 years ago

This breaks compatibility with other Markdown renderers, such as GitHub and Bitbucket, where relative links preserve the extension. I like your format nonetheless, so my suggestion would be to strip out .md and redirect to the canonical form. Unfortunately, there does not seem to be any standard for this in the Markdown community.
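A rough sketch of that suggestion, assuming a Python 3 port of `to_canonical`: strip a known extension from the requested name, canonicalize the rest, and report where to redirect. `KNOWN_EXTENSIONS` and `redirect_target` are hypothetical names used only for illustration; wiring this into an actual route is left out.

```python
import re

KNOWN_EXTENSIONS = (".md",)  # assumption: only Markdown pages exist today

def to_canonical(s):
    # Python 3 port of the function quoted earlier in this thread (assumption)
    s = s.encode("ascii", "ignore").decode("ascii")
    s = re.sub(r"\s\s*", "-", s)
    s = re.sub(r"\-\-+", "-", s)
    s = re.sub(r"[^a-zA-Z0-9\-]", "", s)
    return s[:64]

def redirect_target(requested):
    """Return the canonical page name to redirect to, or None if already canonical."""
    name = requested
    for ext in KNOWN_EXTENSIONS:
        if name.lower().endswith(ext):
            name = name[:-len(ext)]  # strip the extension before canonicalizing
            break
    canonical = to_canonical(name)
    return canonical if canonical != requested else None
```

With this, a link to `test.md` would redirect to `test` instead of creating `testmd.md`, while a request that is already canonical is served as-is.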

scragg0x commented 9 years ago

@loleg You mention Markdown renderers when this isn't about that. The link you posted is referring to links to files in github code repos. I think a more fair comparison would be how github's wiki (gollum) behaves when creating canonical names.

Your pull request took out the line that collapses double dashes and duplicated the line that replaces \s\s* with -.

https://github.com/loleg/realms-wiki/blob/master/realms/lib/util.py#L94-L95

As for my hesitation to merge...

I am looking forward to a time when realms will support more than Markdown. If users start linking to wiki pages with extensions, it might make for messy links. For example, someone's Home.md link could one day point to a page that is actually in rst format. In any case, the application can still figure out which page the link refers to and render it correctly based on the file extension.
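One way to picture that lookup, as a sketch only: ignore whatever extension the link carries and pick the renderer from the extension of the file that actually exists. `RENDERERS` and `resolve_page` are hypothetical names, and the set of existing files is passed in rather than read from a repository.

```python
import os

# Assumption: a mapping from file extension to renderer name, checked in this order
RENDERERS = {".md": "markdown", ".rst": "restructuredtext"}

def resolve_page(link_target, existing_files):
    """Find the stored file for a link, regardless of the extension in the link."""
    base, _ = os.path.splitext(link_target)  # "Home.md" -> "Home"
    for ext, fmt in RENDERERS.items():
        candidate = base + ext
        if candidate in existing_files:
            return candidate, fmt
    return None  # no such page in any supported format

print(resolve_page("Home.md", {"Home.rst"}))
```

Here a stale `Home.md` link still resolves to the `Home.rst` file and is rendered as reStructuredText.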

Again, I will play with gollum and see how it behaves since I'm sure they spent many more hours thinking about this than I have.

loleg commented 9 years ago

@scragg0x oops! Line restored in b8cd97c9.

I see your point, but I still think it should be a user choice. If users have broken links, they can fix them. They can do a global search and replace. Here is Gollum's doc on page links.

This is not a showstopper issue for me. What are your thoughts @bdillahu ?

bdillahu commented 9 years ago

My personal preference would be to use the extension if one appears in the link, and to append a default .md extension (or whatever is configured in the future) if none is given. To me, that would cover the cases where you need to refer explicitly to a particular extension.

But I know there are some security issues involved also.
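As a sketch of how the two concerns might be combined: honour an extension only when it is on an allowlist, and otherwise treat the whole target as a page name and append the default. `ALLOWED_EXTENSIONS`, `DEFAULT_EXTENSION`, and `page_filename` are hypothetical, and `to_canonical` is a Python 3 port of the function quoted earlier.

```python
import os
import re

ALLOWED_EXTENSIONS = {".md", ".rst"}  # assumption: the formats the wiki can render
DEFAULT_EXTENSION = ".md"             # assumption: used when no allowed extension is given

def to_canonical(s):
    # Python 3 port of the function quoted earlier in this thread (assumption)
    s = s.encode("ascii", "ignore").decode("ascii")
    s = re.sub(r"\s\s*", "-", s)
    s = re.sub(r"\-\-+", "-", s)
    s = re.sub(r"[^a-zA-Z0-9\-]", "", s)
    return s[:64]

def page_filename(link_target):
    stem, ext = os.path.splitext(link_target)
    if ext.lower() in ALLOWED_EXTENSIONS:
        # Keep the explicit extension, canonicalize only the stem
        return to_canonical(stem) + ext.lower()
    # Unknown or missing extension: the dot is stripped as before
    return to_canonical(link_target) + DEFAULT_EXTENSION
```

Under these assumptions `test.md` and `test` both map to `test.md`, while `robots.txt` still becomes `robotstxt.md`, so reserved names stay out of reach.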

I'll certainly bow to wiser minds than mine :-)