fletcher / MultiMarkdown-4

This project is now deprecated. Please use MultiMarkdown-6 instead!
https://github.com/fletcher/MultiMarkdown-5

Image definitions with spaces in filenames break HTML #93

Closed mn4367 closed 9 years ago

mn4367 commented 9 years ago
multimarkdown.exe --version
MultiMarkdown version 4.6 ...

images.md:

Author: Me
Date: Today
Title: Test

Image with spaces in filename and direct definition:

![With spaces in filename](A Sample Image.png)

Image with spaces in filename by ID:

![With spaces in filename][sample_spaces:fig]

[sample_spaces:fig]: A Sample Image.png

multimarkdown.exe -f -t html -o images.html images.md

images.html:

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8"/>
    <meta name="author" content="Me"/>
    <meta name="date" content="Today"/>
    <title>Test</title>
</head>
<body>

<p>Image with spaces in filename and direct definition:</p>

<p>![With spaces in filename][With spaces in filename](A Sample Image.png)</p>

<p>Image with spaces in filename by ID:</p>

<figure>
![With spaces in filename][sample_spaces:fig]

<p>[sample_spaces:fig]: A Sample Image.png</p>

</body>
</html>

Pointing to an image with a direct link does no harm (although the result looks strange), but a separate ID leads to an unclosed figure tag. Perhaps this is more a case for documentation, since a 'fix' is already possible:

![With spaces in filename](A%20Sample%20Image.png)

[sample_spaces:fig]: A%20Sample%20Image.png

But I doubt that this will work when creating LaTeX or ODF.
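
The doubt above comes down to direction: HTML wants the percent-encoded relative URL, while a file-based output needs the literal filename back. A minimal Python sketch of that round trip follows. The helper names `to_relative_url` and `to_filename` are hypothetical and are not part of MultiMarkdown.

```python
from urllib.parse import quote, unquote


def to_relative_url(filename: str) -> str:
    """Percent-encode a local filename so it can be used as a relative URL."""
    # quote() keeps "/" unescaped by default, so sub-directories survive.
    return quote(filename)


def to_filename(relative_url: str) -> str:
    """Undo the percent-encoding for outputs that need the real filename."""
    return unquote(relative_url)


url = to_relative_url("A Sample Image.png")
print(url)               # A%20Sample%20Image.png
print(to_filename(url))  # A Sample Image.png
```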

My proposal is to allow the following:

![With spaces in filename]("A Sample Image.png")

[sample_spaces:fig]: "A Sample Image.png"

Handling of spaces should be done depending on the output format. There are probably other characters which have to be escaped for HTML (and other output formats?). Firefox and Safari are able to display the image without escaping (<img src="A Sample Image.png" alt="With spaces in filename" />), but the file won't pass the W3C HTML validation due to the non-escaped spaces in the src-URL.
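
Until something like this is supported, the proposed double-quote syntax could be emulated for HTML output with a small preprocessing step that runs before MultiMarkdown. The sketch below is an assumption about how such a step might look, not existing MultiMarkdown behaviour. The function name `encode_quoted_targets` is hypothetical, and the two patterns cover only the exact forms shown in the proposal.

```python
import re
from urllib.parse import quote

# Inline form of the proposal: ![alt]("file name.png")
INLINE = re.compile(r'!\[([^\]]*)\]\("([^"]+)"\)')
# Reference form of the proposal: [id]: "file name.png"
REFERENCE = re.compile(r'^(\[[^\]]+\]:[ \t]*)"([^"]+)"[ \t]*$', re.MULTILINE)


def encode_quoted_targets(markdown: str) -> str:
    """Rewrite double-quoted image targets as percent-encoded relative URLs."""
    markdown = INLINE.sub(
        lambda m: f"![{m.group(1)}]({quote(m.group(2))})", markdown
    )
    return REFERENCE.sub(lambda m: m.group(1) + quote(m.group(2)), markdown)


source = '''![With spaces in filename]("A Sample Image.png")

[sample_spaces:fig]: "A Sample Image.png"
'''
print(encode_quoted_targets(source))
```

The result is the `%20` form that already works for HTML. Other output formats would need their own handling, which this sketch does not attempt.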

fletcher commented 9 years ago

What you refer to as a filename is actually a relative URL. In this case, it happens to be interpreted as a filename when converting to LaTeX, etc. But it's a URL according to Markdown.

If you insist on using filenames with spaces, you'll have to jump through some hoops (e.g. using file transclusion to include an alternate version for HTML vs LaTeX).

But it's much easier to avoid spaces in your image names.

mn4367 commented 9 years ago

If this is left unchanged, in my opinion it should at least be documented that spaces in filenames need special treatment.

I get your point, but it isn't a nice solution in my opinion. Decorating filenames containing spaces with double quotes is not uncommon and, although not W3C compliant, it would be an easy and working solution for most locally used documents.

Of course one can try to avoid spaces in filenames, but filenames aren't always under the user's control, for example if you generate documentation from sources in a Git repo, like we do.

fletcher commented 9 years ago

Again, Markdown (and therefore MultiMarkdown) "thinks" in terms of URLs. It just so happens that a relative URL pointing to a file in the same directory as the source document looks like a filename (e.g. "foo.png" could be shorthand for "http://some.server/some/directory/foo.png").

LaTeX is fine with using a relative URL when it looks like a file path (e.g. "some/directory/foo.png"). So you get this behavior for free when converting to LaTeX. If your image is on a server, however, it would not get compiled into a PDF via LaTeX.

If you need more complicated URLs that include space/%20 characters, you'll have to come up with your own solution. Markdown and MultiMarkdown are designed to do most of what most users need, not all of what all users need.

My recommendation, if you can't change the original image filename, is to create a symlink using a "simple" name (e.g. "somefoo.png" points to "Some Foo.png"). Then you can use the same image in all formats.
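
For a directory with many such images, the symlink approach can be scripted. This is a minimal Python sketch under a few assumptions: the naming rule (strip whitespace, lowercase) simply mirrors the "Some Foo.png" to "somefoo.png" example, the helper names are hypothetical, and on Windows creating symlinks may require extra privileges.

```python
import os
import re
import tempfile


def simple_name(filename: str) -> str:
    """Derive a whitespace-free, lowercase name: 'Some Foo.png' -> 'somefoo.png'."""
    return re.sub(r"\s+", "", filename).lower()


def link_without_spaces(directory: str) -> dict:
    """Create a simple-named symlink next to every file whose name has whitespace."""
    created = {}
    for name in sorted(os.listdir(directory)):
        if not re.search(r"\s", name):
            continue  # nothing to do for names without whitespace
        link_path = os.path.join(directory, simple_name(name))
        if not os.path.lexists(link_path):
            # A relative target keeps the link valid if the directory is moved.
            os.symlink(name, link_path)
            created[simple_name(name)] = name
    return created


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "Some Foo.png"), "wb").close()
    print(link_without_spaces(tmp))  # {'somefoo.png': 'Some Foo.png'}
```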

mn4367 commented 9 years ago

> Again, Markdown (and therefore MultiMarkdown) "thinks" in terms of URLs. It just so happens that a relative URL pointing to a file in the same directory as the source document looks like a filename (e.g. "foo.png" could be shorthand for "http://some.server/some/directory/foo.png").

I know that, but how does it relate to spaces in filenames? In my opinion, the fact that the image location is relative to the document is irrelevant here, and MMD already handles that fine. If the user gives a relative path like ../images/foo.png, MMD just puts it verbatim into the output (I don't really see the URL "thinking" here, since this is no different from local files, but maybe I'm wrong; if so, my apologies). And any browser will take care of resolving the full name, be it http:// or file:// based.

> LaTeX is fine with using a relative URL when it looks like a file path (e.g. "some/directory/foo.png"). So you get this behavior for free when converting to LaTeX. If your image is on a server, however, it would not get compiled into a PDF via LaTeX.

If I convert my example from above to LaTeX, no \includegraphics command is created. This is in fact my main complaint: the tag structure in HTML is broken, and the LaTeX output is probably not what one expects.

> If you need more complicated URLs that include space/%20 characters, you'll have to come up with your own solution.

I'm a software developer, so for me this isn't really a problem, but for the average user it can be. That's why I proposed to at least document that spaces in filenames need special consideration/treatment.

> Markdown and MultiMarkdown are designed to do most of what most users need, not all of what all users need.

I know that and I'm fine with that. But spaces in filenames are really not an obscure corner case, so this is not a case of "all of what all users need".

But let's step back from the technical aspects for a moment. Suppose there are MMD/A and MMD/B: MMD/A has the current behaviour, and MMD/B allows spaces in filenames using my proposal from above (for inline definitions the double quotes probably wouldn't even be necessary, since the parentheses could serve as delimiters).

John Doe now asks "can I use images with spaces in their filenames?" and you answer "with MMD/B just put them between double quotes when using separate definitions; with MMD/A you can create a symlink without spaces, or you can rename the file or you can use transclusion or you can escape the filenames according to RFCxy". What would John's choice be?

Please don't get me wrong, I'm not asking for a feature here. I'm just trying to act as an advocate for average users, and I think they always deserve software that is as easy to use as possible. In my opinion this involves imposing as few restrictions as possible on filenames, be it spaces, special characters, relative paths, environment variables, drive letters, UNC pathnames or whatever.