sprig / org-capture-extension

A Chrome and Firefox extension facilitating org-capture in Emacs
MIT License

`org-protocol://capture` is cut off from URL #41

Open wjbg opened 6 years ago

wjbg commented 6 years ago

Hi @sprig,

Happy new year! Apologies for bothering you on this holiday, but I am sort of stuck making this work.

When I capture a web page in Firefox (both with and without selected text), Emacs opens a file starting with ?template=, followed by the template identifier and the site information. I checked the developer console and it provides the following information:

Capturing the following URI with new org-protocol: org-protocol://capture?template=L&url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FOrg-mode&title=Org-mode%20-%20Wikipedia&body=,

which looks quite fine to me. Moreover, things work perfectly in case I send the URI directly to emacsclient from the shell, that is:

emacsclientw.exe "org-protocol://capture?template=L&url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FOrg-mode&title=Org-mode%20-%20Wikipedia&body=", does provide the filled out capture template.

So it seems that, when called from Firefox, the start of the URI (i.e. org-protocol://capture) is cut off. The good news (well...) is that the error reproduces in Chrome in exactly the same manner. Any ideas how to resolve this issue?
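For reference, the URI from the console output above can be rebuilt in a few lines of JavaScript. This is only a sketch, assuming each parameter is percent-encoded with `encodeURIComponent`; `buildCaptureUri` is a hypothetical name and is not taken from the extension's capture.js.

```javascript
// Hypothetical helper (not from capture.js): assemble a new-style
// org-protocol capture URI from its four parts.
function buildCaptureUri({ template, url, title, body = "" }) {
  const params = [
    ["template", template],
    ["url", url],
    ["title", title],
    ["body", body],
  ];
  // Percent-encode every value so "&", "/" and spaces cannot break the query.
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `org-protocol://capture?${query}`;
}

console.log(
  buildCaptureUri({
    template: "L",
    url: "https://en.wikipedia.org/wiki/Org-mode",
    title: "Org-mode - Wikipedia",
  })
);
```

Printing the result makes it easy to compare, character by character, with what the registered handler actually receives.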

Thanks in advance!

Here's my setup:

These are the capture templates:

 ("p" "Protocol" entry (file+headline 
        ,(concat org-directory "/inbox.org") "Web captures")
        "* %^{Title}\nSource: %u, %c\n #+BEGIN_QUOTE\n%i\n#+END_QUOTE\n\n\n%?") 
 ("L" "Protocol Link" entry (file+headline 
        ,(concat org-directory "/inbox.org") "Web captures")
        "* %? [[%:link][%(transform-square-brackets-to-round-ones \"%:description\")]]\n")

And here is the registry file I used to set up org-protocol:

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\org-protocol]
@="URL:Org Protocol"
"URL Protocol"=""

[HKEY_CLASSES_ROOT\org-protocol\shell]

[HKEY_CLASSES_ROOT\org-protocol\shell\open]

[HKEY_CLASSES_ROOT\org-protocol\shell\open\command]
@="\"c:\\Program Files (x86)\\emacs-25.3_1\\bin\\emacsclientw.exe\" \"%1\""
sprig commented 6 years ago

Hi @wjbg, happy new year to you as well!

From what you describe, I would guess that Windows interpolates %1 to something other than the entire URI. I don't know if Windows has a shell tool similar to open on OSX or xdg-open on Linux (something that picks the appropriate software to handle URIs). If it does, I suggest trying it with the URI you gave: org-protocol://capture?template=L&url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FOrg-mode&title=Org-mode%20-%20Wikipedia&body=. Otherwise, maybe create a simple HTML file with a link to that URI and try clicking on it in FF/Chrome. I suspect the result will be the same as with the extension, indicating that the registry is indeed the problem.

I don't know if you could interpolate differently (e.g. via other numbered parameters) but perhaps a simple solution (if the problem is as I thought) would be to have in the registry instead

[HKEY_CLASSES_ROOT\org-protocol\shell\open\command]
@="\"c:\\Program Files (x86)\\emacs-25.3_1\\bin\\emacsclientw.exe\" \"org-protocol://capture%1\""
wjbg commented 6 years ago

Hi @sprig,

Thanks for your swift reply.

It appears that you can use start to handle URIs from the shell. As expected, start "" "org-protocol://capture?template=L&url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FOrg-mode&title=Org-mode%20-%20Wikipedia&body=" opens a file whose name starts with ?template, as was also the case when called using the browser extension. I've tried your creative registry hack, but that gave the exact same behavior.

The good news is that I was able to make it work, though. The following yields the expected behavior:

start "" "org-protocol:/capture?template=L&url=https://en.wikipedia.org/wiki/Org-mode&title=Org-mode&body="

Yes, that is with only one / after org-protocol:. Strangely emacsclientw.exe does not care if there is one or two slashes, it works in both cases.

Any ideas what is going on here?

sprig commented 6 years ago

Sounds very odd and unfortunately I have no idea :-(

wjbg commented 6 years ago

Hmm. Pity, but thanks for your time.

I checked whether there is a problem with the interpretation of %1 (by making org-capture-extension use msg.exe instead of emacsclientw.exe), but that is not the case. Also, the strange thing is that the links on https://orgmode.org/worg/org-contrib/org-protocol.html do work fine.

So, to summarize: we get different behavior depending on the exact URI.

start "" "org-protocol:/capture?template=L&url=https://en.wikipedia.org/wiki/Org-mode&title=Org-mode&body=" yields the expected behavior.

start "" "org-protocol://capture?template=L&url=https://en.wikipedia.org/wiki/Org-mode&title=Org-mode&body=" does not. It looks like the URI is cut in pieces.

Using the first link explicitly in the registry also works fine. Lastly, 'emacsclient.exe' works on both links.
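The two variants differ only in the number of slashes after `org-protocol:`. Below is a minimal sketch of rewriting that count, for instance in a locally patched copy of the extension. `setSchemeSlashes` is a hypothetical helper name; nothing here is taken from capture.js.

```javascript
// Hypothetical helper: force exactly `count` slashes after the scheme.
// The pattern is anchored at the start, so a "://" that appears later
// (for example inside an unencoded url parameter) is left alone.
function setSchemeSlashes(uri, count) {
  return uri.replace(
    /^([a-z][a-z0-9+.-]*):\/*/i,
    (_match, scheme) => `${scheme}:${"/".repeat(count)}`
  );
}

const twoSlashes =
  "org-protocol://capture?template=L&url=https://en.wikipedia.org/wiki/Org-mode&title=Org-mode&body=";
console.log(setSchemeSlashes(twoSlashes, 1));
```

With a count of 1 this produces the single-slash form that worked with start in the tests above.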

Well, if anyone can help out and enlighten me, that would be appreciated.

dilzeem commented 6 years ago

Hi @wjbg,

I was wondering whether you have gotten this to work on Windows? I currently use Spacemacs, and spent the last few hours trying to get this to work, to no avail.

Currently, I am getting a blank frame when I call this. I will try to fiddle around with it tomorrow, and let you know if I come up with anything.

rgemulla commented 6 years ago

I got it working with Firefox and Windows 10. My problem was #31 (which looks in Emacs as if the capture part is cut off, which it actually isn't). My solution was to create a batch file (content below) and register that batch file as the org-protocol handler. The file essentially adds another / and does some quoting.

@echo off
set URL=%1
set URL=%URL:&=^&%
set URL=%URL:/?=?%
set URL=%URL:://=:///%
"c:\Program Files\emacs\bin\emacsclientw.exe" -na "c:\Program Files\emacs\bin\runemacs.exe" "%URL%"
dilzeem commented 6 years ago

@rgemulla thanks for the info. still haven't got it working, but I think my issue is how I load it into spacemacs.

I have tried running the following in command line:

start "" "org-protocol:/capture?template=L&url=https://en.wikipedia.org/wiki/Org-mode&title=Org-mode&body="

start "" "org-protocol://capture?template=L&url=https://en.wikipedia.org/wiki/Org-mode&title=Org-mode&body="

start "" "org-protocol:///capture?template=L&url=https://en.wikipedia.org/wiki/Org-mode&title=Org-mode&body="

with none of them really working

wjbg commented 6 years ago

Hi @rgemulla and @dilzeem,

Somewhat comforting to learn that I am not the only one struggling. I eventually got it working, although not in a way I wanted. I downloaded the source code and changed capture.js to return org-protocol:/capture... (so with one slash), went through the Mozilla signing process and created an extension for personal use that actually works. Although it works, I am not too happy as obviously this way the extension is not maintained or updated. I may want to try the approach @rgemulla proposed.

I have no experience with Spacemacs, so I cannot offer any advice to @dilzeem. Googling org-protocol spacemacs gave me this: https://gist.github.com/cjp/64ac13f5966456841c197f70c7d3a53a. Perhaps it is useful.

sprig commented 6 years ago

Hi all,

@wjbg I actually started working a while ago on an enhancement that does what you want (essentially adding an "advanced" option that would allow users to define their own capture URL template). I will eventually finish it, but so far I have lacked the time to do so.

You can check it out and of course are welcome to contribute: https://github.com/sprig/org-capture-extension/tree/issue-32-33

sprig commented 6 years ago

@dilzeem I looked now at the gist that @wjbg posted and it looks to be the correct way to initialize org-protocol (not just on windows). Are you doing something similar? Have you tried looking at the *Messages* (C-h e) buffer after these blank frames open?

dilzeem commented 6 years ago

@sprig Thanks for the assistance. The Messages buffer had nothing, so it is kind of hard to debug. I think it's the way I start my server, since I automatically run Emacs as a daemon when my computer starts up.

I also tried to follow the proper way to use org-protocol in spacemacs as outlined here: https://github.com/syl20bnr/spacemacs/issues/3895

But I got nothing to work. So for the time being I will work on some other things and hopefully improve my Emacs debugging skills to figure this out at some later time.

wjbg commented 6 years ago

@sprig The enhancement sounds like a nice way forward; especially the option to define your own templates sounds appealing to me. I'll check it out and see whether I can contribute (no java experience though...).

miabrahams commented 6 years ago

I also found that modifying capture.js to send only one slash in "org-protocol:/capture?..." was necessary to get this working on Windows.

Does formatting the URI with a single slash work on other platforms? If it does, making that simple change would probably be more valuable than spending a lot of time working on custom template code.

(Both of these commands work for me with either one or two slashes.)

> C:\path\to\emacsclientw.exe "org-protocol:/capture?template=p&url=www.github.com&title=Link&body=test"
> start "" "org-protocol:/capture?template=p&url=www.github.com&title=Link&body=test"
sprig commented 6 years ago

The problem is that this is hard to know - since there are many variations of the same platforms. FWIW, a single slash works on my computer. I will probably add an option to select # of slashes, since this is probably the most requested feature...

josejaviernieto commented 5 years ago

Hello, has anybody found a solution?

rgd commented 5 years ago

I'm seeing this same problem (Windows 7, Emacs 27.0.50, Org 9.1.14). I found this post: https://lists.gnu.org/archive/html/emacs-orgmode/2017-08/msg00587.html and when I debugged into org-protocol-check-filename-for-protocol, I saw the same thing: somewhere between what Chrome was sending and what emacsclientw sent to Emacs, the "capture?" part was changed into "capture/?". For me that causes it to fail on line 619 of org-protocol.el, which tries to match the fname passed from Chrome against a regexp that expects just "?", not "/?".
And Nickolay's reply seems to be the same answer mentioned here. I have little idea why it might be happening but I'd suspect some effect in Windows' handling of the string between Chrome and emacsclientw or emacsclientw and Emacs. Something not seen on other OS's.
I haven't tried to figure out how to modify your extension to test this, but wanted to let you know it's being used and people are still seeing this issue. If I can't figure out how to change the extension, I can use bookmarklets for the time being. Thanks for the extension though - when it was working it was great!

rgd commented 5 years ago

Oh - if I change the org-protocol handler from emacsclient to a batch file that prints out its parameters, I can see the "capture/?" is coming out of Chrome and/or the extension before it gets into emacsclient or Emacs. So I would think it's maybe an issue of Chrome, Chrome on Windows, extensions in Chrome. Something like that.

sprig commented 5 years ago

Thanks for the update! I'm sorry that it doesn't work for you and other (windows) users. I did not have time to work on it for several months, but hopefully the situation will improve soon. Frankly though, this is OSS and as I've said before I would welcome a PR.

But your investigation reminded me of this comment; https://github.com/sprig/org-capture-extension/issues/41#issuecomment-359195905 - this seems to be a simple enough solution - did you try it?

Either way, while reviewing the URI RFC I realized that even a single / is not necessarily mandatory. I think it might be an even better/more robust solution to completely remove //, potentially. Comments?

emakei commented 5 years ago

There are problems. The window just opens and closes immediately. In Messages:

Loading quail/cyrillic...done
Starting Emacs daemon.
Greedy org-protocol handler. Killing client.
No server buffers remain to edit
Deprecated date/weektree capture templates changed to ‘file+olp+datetree’.
Greedy org-protocol handler. Killing client.
No server buffers remain to edit
Deprecated date/weektree capture templates changed to ‘file+olp+datetree’.
Greedy org-protocol handler. Killing client.
No server buffers remain to edit
Deprecated date/weektree capture templates changed to ‘file+olp+datetree’.

P.S. Without this I see: http://prntscr.com/ln39oa

emakei commented 5 years ago

And when I try the example from org-protocol.el, I see:

Warning (emacs): Please update your Org Protocol handler to deal with new-style links.

coltoneakins commented 5 years ago

-------------------------------------------------------------------------------------

Let's chat peeps.

Background and Environment

Here is my .spacemacs config:

  ;; Add org-protocol for org-capture
  (require 'server)
  (unless (server-running-p)
    (server-start));;for emacs client
  (require 'org-protocol)

  ;; org-capture templates
  (require 'org-capture)
  (use-package org-protocol
    :demand
    :config
    (add-to-list 'org-capture-templates
                 '("p" "Protocol" entry (file "")
                   "* %?[[%:link][%:description]] %U\n%i\n" :prepend t))
    (add-to-list 'org-capture-templates
                 '("L" "Protocol Link" entry (file "")
                   "* %?[[%:link][%:description]] %U\n" :prepend t)))

Here is the file I am using to add to the registry in Windows 10 with the new WSL:

REGEDIT4

[HKEY_CLASSES_ROOT\org-protocol]
@="URL:Org Protocol"
"URL Protocol"=""
[HKEY_CLASSES_ROOT\org-protocol\shell]
[HKEY_CLASSES_ROOT\org-protocol\shell\open]
[HKEY_CLASSES_ROOT\org-protocol\shell\open\command]
@="\"C:\\Windows\\System32\\wsl.exe\" emacsclient  \"%1\""

Note: The last line of your *.reg file will be different if you installed the Windows version of Emacs. I am using Debian via the WSL in Windows 10 which allows me to invoke Linux commands via the wsl.exe on Windows. See: https://docs.microsoft.com/en-us/windows/wsl/interop

What is causing this issue?

This extension is not creating links properly for Emacs to understand. Specifically, this extension is not making the new-style (query-style) links properly that were introduced in Version 9.0 of Org-mode. See: https://orgmode.org/Changes_old.html

As mentioned in the changelog for Org-mode Version 9.0, this is what a new-style link is supposed to look like:

org-protocol://capture?template=x&title=Hello&body=World&url=http:%2F%2Fexample.com

This is an example of a link that this extension is sending to Emacs. The extension is trying to make a new-style link, but it is not doing so properly:

org-protocol://capture/?template=L&url=https%3A%2F%2Fwww.google.com%2F&title=Google&body=

Error: There should not be a forward slash following the word capture in the URL.
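The difference comes down to a single character, so it can be removed mechanically before the URI is handed on. The sketch below mirrors the `set URL=%URL:/?=?%` line of the batch file posted earlier in this thread, except that it only touches the first occurrence. `stripSlashBeforeQuery` is a hypothetical name, not part of the extension or of Org.

```javascript
// Hypothetical helper: drop the "/" that sits directly before the first "?".
// URIs that do not contain "/?" are returned unchanged.
function stripSlashBeforeQuery(uri) {
  return uri.replace("/?", "?");
}

const received =
  "org-protocol://capture/?template=L&url=https%3A%2F%2Fwww.google.com%2F&title=Google&body=";
console.log(stripSlashBeforeQuery(received));
```

Because the extension percent-encodes its parameters, the only literal "/?" in such a URI should be the unwanted one; that is an assumption about the input, not a guarantee.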

The good news is that Org-mode still supports the old style of links, and this extension can create old-style links if you enable that in the extension's settings.

This problem has nothing to do with the number of leading forward slashes after org-protocol:// that is mentioned in these two comments above: https://github.com/sprig/org-capture-extension/issues/41#issuecomment-355296101 https://github.com/sprig/org-capture-extension/issues/41#issuecomment-359250124

Solutions

Here are your options, in order from easiest to most difficult:

  1. Turn off the option "Use New-Style links? (Recommended for Org-Mode 9.0+)" in the options for this extension. Just right click the icon for this extension; then, hit "Options". Uncheck the box that specifies to use the new-style links.

  2. Create your own custom bookmarklet or user-script (JavaScript) as mentioned in https://orgmode.org/worg/org-contrib/org-protocol.html#org6321f37 to make a special bookmark in your bookmarks bar that will properly construct links for Emacs.

  3. As brought up in the comments above, download the source code for this Chrome extension. Then, make the necessary changes to the JavaScript in capture.js in the source code. Then, open your private version of the extension in Chrome with the developer options enabled under the 'Manage Extensions' page.

-------------------------------------------------------------------------------------

sprig commented 5 years ago

Thanks for the thorough analysis, although I take issue with your conclusion. The easiest solution, if you are right, is to submit your one-letter fix as a PR and let everyone enjoy the fruits of your discovery without going through the bug reports.

coltoneakins commented 5 years ago

Fair point. However, I was unable to determine what was causing the extension to form links improperly. This line in capture.js looks fine to me: https://github.com/sprig/org-capture-extension/blob/034cc563e66511a8ed8fe7df5b43124d9719a0d2/capture.js#L33

But I know that the links are improper. I am not sure how the extra forward slash / is being thrown in, but it is. I verified this by changing my org-protocol handler on Windows to echo the URL back to me instead, using this file to edit my registry:

REGEDIT4

[HKEY_CLASSES_ROOT\org-protocol]
@="URL:Org Protocol"
"URL Protocol"=""
[HKEY_CLASSES_ROOT\org-protocol\shell]
[HKEY_CLASSES_ROOT\org-protocol\shell\open]
[HKEY_CLASSES_ROOT\org-protocol\shell\open\command]
@="\"C:\\Windows\\System32\\wsl.exe\" echo \"%1\" && sleep 5 "

When the URLs are shown back to me after triggering the extension, they have the extra forward slash.

sprig commented 5 years ago

That’s also a fair point. Forgive me for being slightly skeptical, since I see you went through a lot of trouble debugging it. It’s just that if you go through previous bug reports you will see that various utilities sometimes screw up the URL, including Windows sometimes.

Would you mind also trying using the extension with logging enabled, and report what is shown in the JavaScript console?

Pfedj commented 5 years ago

Hello, are there any other suggestions? The above proposals did not work for me.

aaronbieber commented 4 years ago

I just hit this issue and solved it in the same way that @wjbg did (creating my own modified self-signed extension that uses only a single forward slash). I think I can add a little more information and a little bit of speculation here.

I am using Windows 10 and Firefox. I discovered that if you attempt to visit a URL like protocol://resource?param=value, Firefox will "helpfully" append a trailing forward slash, like so: protocol://resource/?param=value, which is in keeping with (at least) the HTTP spec (I didn't review the URI spec). This appears to break org capture, and I confirmed this by testing both variations from the command line.
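The HTTP-style normalization described above can be observed with the WHATWG `URL` class that browsers and Node.js provide. This sketch only demonstrates the `http` case; whether Firefox applies the same step to `org-protocol:` links is the observation reported in this comment, not something the snippet proves. `normalizeHref` is a name made up for the example.

```javascript
// Parse and re-serialize a URL with the built-in WHATWG URL class.
// For http(s) URLs an empty path is serialized as "/".
function normalizeHref(input) {
  return new URL(input).href;
}

console.log(normalizeHref("http://resource?param=value"));
// → http://resource/?param=value
```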

The client "helpfully" expands the name and tries to open c:/Windows/system32/org-protocol:/capture/?template=... (the base path is wherever emacsclientw was called from, and seems to fall into system32 if called from the registry handler).

Removing one of the forward slashes, which results in org-protocol:/capture?template=... avoids the trailing slash mangling, presumably because the form no longer matches a proper URI.

I concluded from this that the issue here is not a bug in this software, or in Windows (well, not necessarily, read on), but rather that Emacs doesn't understand and parse the trailing slash after the resource name in keeping with (at least) the HTTP spec. Windows is implicated because it doesn't seem that org-protocol:/resource should even be recognized as a protocol to be handled, but for whatever reason, Windows will execute your registry-defined handler anyway.

I haven't yet been able to track down where in the elisp code the protocol name and resource name are handled, but if we can identify a location where that trailing slash could be accounted for, I think it would solve all of the problems described here.

Edit: I located where the protocol string is processed, see org-protocol.el line 613 (in org-protocol-check-filename-for-protocol). You'll note that the regexp is constructed by appending :/+ to the protocol name (org-protocol), so Emacs can handle any number of forward slashes after the colon.

Edit 2: The line that needs to change to allow for Firefox's inserted trailing slash is 619 in the same file, although doing so will break the new-style detection on line 625 also, so this will require more effort to get working. I think that I have provided enough information here to at least exonerate this project as the true root cause, even if manipulating the number of slashes gets us to a working spot.

fpiper commented 3 years ago

This is now fixed in org. Starting with version 9.4 org mode recognizes the form protocol://resource/?param=value. So this should not be a problem anymore.
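To illustrate what accepting both forms means, here is a sketch of a tolerant parser in JavaScript. It is not the Org implementation (that lives in org-protocol.el and is written in Emacs Lisp); `parseOrgProtocol` and its regular expression are assumptions made purely for this example.

```javascript
// Illustrative only: accept one or more slashes after "org-protocol:" and an
// optional "/" between the sub-protocol name and the "?" that starts the query.
function parseOrgProtocol(uri) {
  const match = /^org-protocol:\/+([^\/?]+)\/?\?(.*)$/.exec(uri);
  if (match === null) {
    return null; // not a new-style (query-style) link
  }
  const [, subprotocol, query] = match;
  // URLSearchParams percent-decodes the values.
  const params = Object.fromEntries(new URLSearchParams(query));
  return { subprotocol, params };
}

for (const uri of [
  "org-protocol://capture?template=L&url=https%3A%2F%2Fwww.google.com%2F&title=Google&body=",
  "org-protocol://capture/?template=L&url=https%3A%2F%2Fwww.google.com%2F&title=Google&body=",
  "org-protocol:/capture?template=L&url=https%3A%2F%2Fwww.google.com%2F&title=Google&body=",
]) {
  console.log(parseOrgProtocol(uri));
}
```

All three spellings discussed in this thread parse to the same sub-protocol and parameters under this sketch.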