typora / typora-issues

Bugs, suggestions or free discussions about the minimal markdown editor — Typora
https://typora.io

Absolute URIs still don't work in 0.9.31 #811

Closed pjeby closed 7 years ago

pjeby commented 7 years ago

As described in #700, Typora still does not launch markdown hyperlinks of the form [text](scheme:data), where scheme is anything outside a very narrow range, unless a // is included. Some URL schemes it doesn't launch include (but aren't limited to):

Rather than checking against a list of known URL schemes, Typora should simply launch anything that's a syntactically-absolute URI via the OS (with the exception of single-letter schemes on Windows, which could be a drive path). Per RFC 3986:

Scheme names consist of a sequence of characters beginning with a letter and followed by any combination of letters, digits, plus ("+"), period ("."), or hyphen ("-").

So if a hyperlink begins with a scheme name and a :, it can be considered an absolute URL and should be launched as such.
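A minimal sketch of that check in TypeScript, assuming a hypothetical helper named `getScheme` (it is not taken from Typora's code). It applies the RFC 3986 scheme grammar quoted above and treats a single-letter scheme as a possible Windows drive path:

```typescript
// RFC 3986 scheme grammar: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), then ":".
const SCHEME_RE = /^([A-Za-z][A-Za-z0-9+.-]*):/;

// Hypothetical helper: returns the lowercased scheme of an absolute URI,
// or null if the link has no scheme or looks like a Windows drive path.
function getScheme(href: string): string | null {
  const match = SCHEME_RE.exec(href);
  if (match === null) return null;
  const scheme = match[1];
  // "C:\notes.md" parses as scheme "c", so single letters are treated as drives.
  if (scheme.length === 1) return null;
  return scheme.toLowerCase();
}

console.log(getScheme("mailto:someone@example.com")); // "mailto"
console.log(getScheme("C:\\notes.md"));               // null
console.log(getScheme("notes/todo.md"));              // null
```

Note that nothing here looks for `//`; only the scheme and the colon matter.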

It also appears as though Typora is still looking for a // in a URL to determine if it is valid. This is erroneous, as most of the schemes I've listed above do not use // in them. Per RFC 2718:

Contrary to some examples set in past years, the use of double slashes as the first component of the <scheme-specific-part> of a URL is not simply an artistic indicator that what follows is a URL: Double slashes are used ONLY when the syntax of the URL's <scheme-specific-part> contains a hierarchical structure as described in RFC 2396. In URLs from such schemes, the use of double slashes indicates that what follows is the top hierarchical element for a naming authority. (See section 3 of RFC 2396 for more details.) URL schemes which do not contain a conformant hierarchical structure in their <scheme-specific-part> should not use double slashes following the "<scheme>:" string.

So, mailto:, magnet:, evernote:, onenote:, etc. do not use // because they are not hierarchical in nature. (The same goes for many other custom URL schemes used by various Windows and OS X apps, making it harder to use Typora to reference data in these apps.)

(All that being said, I love 0.9.31's new file-browsing pane!)

abnerlee commented 7 years ago

I know what you mean now, so #700 is not fully fixed.

The difficulty is that if a user writes [link](localhost:8080) or [link](my-wiki:400/index.html), we still want to add http:// automatically, since http:// is obviously missing; but for [link](magnet:), no // should be prepended...

pjeby commented 7 years ago

On Windows, if the registry key HKEY_CLASSES_ROOT\scheme has a "URL Protocol" value, then "scheme" is a launchable URI scheme and can be invoked via the shell. I'm not as familiar with OS X, but this superuser question seems to indicate there is a LaunchServices .plist file that can be read to determine registered URL schemes there. On Linux, GNOME and KDE apparently have some mechanism to register these, but it's not especially clear. So in principle, you could check for a valid scheme first and then fall back to adding http://.
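That check-then-fall-back idea could look roughly like the sketch below. The `isRegistered` callback is a placeholder for whatever platform-specific lookup is used (registry, LaunchServices, and so on); it and `resolveWithFallback` are hypothetical names, not Typora APIs.

```typescript
// Placeholder type for a platform-specific "is this scheme registered?" lookup.
type SchemeLookup = (scheme: string) => boolean;

// Hypothetical helper: keep the link if its scheme is registered with the OS,
// otherwise assume the author left off "http://".
function resolveWithFallback(href: string, isRegistered: SchemeLookup): string {
  const match = /^([A-Za-z][A-Za-z0-9+.-]*):/.exec(href);
  if (match === null) return href; // no scheme-like prefix: leave relative paths alone
  if (isRegistered(match[1].toLowerCase())) return href; // launch as-is through the OS
  return "http://" + href; // location-bar-style guess
}

// Example with a stubbed lookup standing in for the real OS query.
const registeredSchemes = new Set(["mailto", "magnet", "onenote"]);
const stubLookup: SchemeLookup = (scheme) => registeredSchemes.has(scheme);

console.log(resolveWithFallback("magnet:?xt=urn:btih:abc", stubLookup)); // unchanged
console.log(resolveWithFallback("localhost:8080", stubLookup)); // "http://localhost:8080"
```

The result depends entirely on what the lookup reports, so the same link can resolve differently on two machines.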

All that being said, though: localhost:8080 and the rest of your examples are not valid URLs! They are not valid relative urls (which would require they begin with //), and they are not valid absolute urls (because they start with invalid schemes). From my POV, I would be perfectly happy with treating them as absolute URLs with a broken scheme, because other markdown editors do, and it's therefore somewhat unlikely that Typora users are relying on them being functional.

For example, GitHub's markdown processor considers localhost:8080 to be an absolute URL with an unrecognized localhost: scheme... which it then refuses to render as a link because it's not on the URL scheme whitelist. But if you add the leading //, then it's treated as a relative URL that keeps the current https scheme. AFAIK, this is common behavior for markdown processors that accept untrusted user input and display it to other users (i.e., any modern web app that allows markdown input).

Markdown processors used in trusted-user scenarios don't do this filtering, which would leave the localhost:8080 open to browser interpretation... and browsers correctly interpret it as an absolute URL with a localhost: scheme again. :smile:
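This is easy to check with the WHATWG `URL` parser that browsers and Node.js expose. A small sketch; `https://example.com/page` is only an illustrative base URL:

```typescript
// Parsed on its own, "localhost:8080" is an absolute URL whose scheme is "localhost".
const bare = new URL("localhost:8080");
console.log(bare.protocol); // "localhost:"
console.log(bare.pathname); // "8080"

// With a leading "//" it is scheme-relative and inherits the base URL's scheme.
const schemeRelative = new URL("//localhost:8080", "https://example.com/page");
console.log(schemeRelative.href); // "https://localhost:8080/"
```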

The only place where localhost:8080 is normally treated as an HTTP URL is in browser location bars, where some non-URL inputs are accepted. This may give the impression that localhost:8080 is a valid URL, even though it's not. In browser URL bars, the browser typically checks for the scheme registration and falls back to interpretation as an HTTP URL.

Whew. So, yeah. The two options here are to emulate browser location bar behavior (which is both complex and highly platform-dependent), or browser link href behavior, which is simple, defined by RFC standards, and platform-independent. (IMO, matching the link behavior is also more accurate, since what's being generated is in fact a link href. :smile:)

abnerlee commented 7 years ago

I think this is fixed in v0.9.32

Please reopen this one instead of creating a new issue if it still has bugs.

pjeby commented 7 years ago

It now works fine in the editor, but if you export as HTML and open the HTML in a browser, then links like localhost:8080 don't work. Firefox displays an error page, and Chrome refuses to click on them.

So, this new approach is teaching people their invalid links are fine (because Typora will open them), and then they will fail if people make HTML from it. So it would probably be better to display an error message saying that localhost: (or whatever) doesn't have an app available to open it, and suggesting that maybe they want to put an http:// or https:// on it. Then if the link works in Typora it should probably work in the browser.

(That being said, it now works for what I need, because I know that localhost isn't a valid URL and I'm not going to make that mistake. But it may come back and bite you later with support requests from people wondering why their links now work in Typora but not in their generated HTML.)

abnerlee commented 7 years ago

Exported HTML should behave the same as Typora, so this is a mirror bug either way.

The original intention was to make Typora smart enough to auto-correct some common mistakes. But yes, something like markdown-lint can be implemented to show such errors directly.
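A lint-style check of that kind might look like the sketch below. The heuristic and the names `HOST_PORT_RE` and `lintLink` are assumptions for illustration, not an existing markdown-lint rule:

```typescript
// Heuristic: a scheme-like prefix followed only by a port-sized number
// (and an optional path) probably means "host:port" with "http://" left off.
const HOST_PORT_RE = /^[A-Za-z][A-Za-z0-9+.-]*:\d{1,5}(\/.*)?$/;

// Hypothetical lint check: returns a warning string, or null if the link looks fine.
function lintLink(href: string): string | null {
  if (HOST_PORT_RE.test(href)) {
    return `"${href}" looks like host:port; did you mean "http://${href}"?`;
  }
  return null;
}

console.log(lintLink("localhost:8080"));             // warning
console.log(lintLink("mailto:someone@example.com")); // null
```

Because this only guesses, it would also flag a legitimate link whose data happens to be a short number, so it suits a warning rather than an automatic rewrite.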

pjeby commented 7 years ago

It gets worse. The current behavior already has an additional consistency problem: behavior of the same document across different machines!

Right now, if I write a document in Typora with a link to onenote:4040, then on my machine it's going to load a OneNote file. If you click the link on your machine without OneNote installed, it's going to open a web browser and try to go to onenote.com.

Whereas, if you generate an HTML file, then the behavior will be that on my machine it either works or is blocked (depending on browser settings), and on your machine the browser will either block it or display an error message saying that there's no app installed that does onenote: links.

The only way to make that consistent between markdown and HTML is to emulate the browser behavior and display an error message saying there's no app installed that does onenote: links. Or localhost: links, or whatever.

Because, if you change how the HTML is generated based on the installed apps, then people on two different machines will get different HTML from the same markdown, and at least one of those HTML documents will be wrong. (That is, if you try to stick in the http://, then you'll turn perfectly valid onenote: links (or whatever) into bad http links that won't work on any machine. Whereas leaving the links alone means they'll work on machines with the relevant app, and not on the ones that don't.)

Whew. Who'd've thought opening links in an app could be this complicated? :smile:

Well, actually, I guess you can simplify it quite a bit, following this approach:

  1. Check if the link is an absolute URL (i.e., begins with a regex match to ^[A-Za-z][-.+A-Za-z0-9]+: -- that is, an RFC 2396-valid scheme of two or more characters followed by a :). If so, launch it via the OS's "launch a URL" facility, leaving the OS to display an error message if the app isn't found.
  2. If the path begins with a '/', refuse to open the link, because root and scheme-relative URLs aren't usable with file:-based URIs
  3. Treat the path as a filename relative to the current file's directory, i.e., add a / to the current file's directory, then add the path on without changing it in any other way. (Yes, this works on Windows, too, which silently changes / to \ in filenames.) If the file exists, open it, otherwise display an error message.

Deviating from the above steps or altering the URL in any way other than as described automatically means you are introducing either machine-to-machine inconsistencies or markdown-vs-html inconsistencies in link behavior. The above process is how browsers process link hrefs when reading a local HTML file. (For pages retrieved from the web, URLs beginning with a / are processed relative to the site, and ones starting with // are processed relative to the scheme. But Typora can't do that because it's working directly with files.)
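The three steps above can be sketched as a pure classification function. The type `LinkAction`, the function `classifyLink`, and the `docDir` parameter are hypothetical names for illustration; the caller would still have to launch the URI or check that the file exists and show an error otherwise.

```typescript
type LinkAction =
  | { kind: "launch"; uri: string }      // step 1: hand to the OS URL launcher
  | { kind: "refuse"; reason: string }   // step 2: root- or scheme-relative link
  | { kind: "open-file"; path: string }; // step 3: file relative to the document

// Hypothetical helper: decides what to do with a link without touching the OS.
// `docDir` is the directory of the current Markdown file.
function classifyLink(href: string, docDir: string): LinkAction {
  // Step 1: a scheme of two or more characters followed by ":".
  if (/^[A-Za-z][-.+A-Za-z0-9]+:/.test(href)) {
    return { kind: "launch", uri: href };
  }
  // Step 2: "/path" and "//host/path" have no meaning for a local file.
  if (href.startsWith("/")) {
    return { kind: "refuse", reason: "root- and scheme-relative links need a web server" };
  }
  // Step 3: append the link, unchanged, to the document's directory.
  return { kind: "open-file", path: docDir + "/" + href };
}

console.log(classifyLink("onenote:4040", "/home/me/docs").kind); // "launch"
console.log(classifyLink("/index.html", "/home/me/docs").kind);  // "refuse"
console.log(classifyLink("img/a.png", "/home/me/docs"));         // open-file, /home/me/docs/img/a.png
```

The link text is never rewritten in any branch, which is what keeps the behavior the same across machines and in exported HTML.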

Granted, browsers don't usually delegate the error message to the OS, they display one themselves... but Windows at least is kind enough to provide an appropriate error popup if you try to launch an invalid scheme, so there's little point in further filtering there. (Try hitting Win-R, then typing localhost:8080 in the run box and pressing enter... that'll pull up Windows' "There is no program associated to perform the requested action" message.)

abnerlee commented 7 years ago

Should be fixed in the latest build (0.9.37); RFC 2396-valid schemes will now be followed.

pjeby commented 7 years ago

If you open a link that begins with a /, 0.9.37 launches the web browser (which then does nothing, because the link is invalid in a file:// context, even though an HTML version might work if served by a webserver).

But the handling for registered schemes is definitely working, albeit without any error message.