jkitchin / org-ref

org-mode modules for citations, cross-references, bibliographies in org-mode and useful bibtex tools to go with it.
GNU General Public License v3.0

"doi-add-bibtex-entry" results in "End of file while parsing JSON" #473

Closed by jack836 3 years ago

jack836 commented 7 years ago

I tried to add BibTeX entries to my .bib file using 'doi-add-bibtex-entry', but I always get an "End of file while parsing JSON" message. However, when I run this in the terminal: curl -LH "Accept: application/citeproc+json" "http://doi.org/10.1021/jp511426q"

I get the following output

{"indexed":{"date-parts":[[2017,6,23]],"date-time":"2017-06-23T05:46:07Z","timestamp":1498196767565},"reference-count":52,"publisher":"American Chemical Society (ACS)","issue":"9","funder":[{"DOI":"10.13039\/100000078","name":"Division of Materials Research","doi-asserted-by":"publisher","award":["DMR 0843934"]},{"DOI":"10.13039\/100006151","name":"Basic Energy Sciences","doi-asserted-by":"publisher","award":["DE-SC0004031"]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Phys. Chem. C"],"published-print":{"date-parts":[[2015,3,5]]},"DOI":"10.1021\/jp511426q","type":"journal-article","created":{"date-parts":[[2015,2,10]],"date-time":"2015-02-10T03:10:55Z","timestamp":1423537855000},"page":"4827-4833","source":"Crossref","is-referenced-by-count":0,"title":"A Linear Response DFT+UStudy of Trends in the Oxygen Evolution Activity of Transition Metal Rutile Dioxides","prefix":"10.1021","volume":"119","author":[{"given":"Zhongnan","family":"Xu","affiliation":[]},{"given":"Jan","family":"Rossmeisl","affiliation":[]},{"given":"John R.","family":"Kitchin","affiliation":[]}],"member":"316","container-title":"The Journal of Physical Chemistry C","original-title":[],"deposited":{"date-parts":[[2017,6,23]],"date-time":"2017-06-23T05:14:39Z","timestamp":1498194879000},"score":1.0,"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,3,5]]},"references-count":52,"alternative-id":["10.1021\/jp511426q"],"URL":"http:\/\/dx.doi.org\/10.1021\/jp511426q","relation":{},"ISSN":["1932-7447","1932-7455"],"issn-type":[{"value":"1932-7447","type":"print"},{"value":"1932-7455","type":"electronic"}],"subject":["Energy(all)","Physical and Theoretical Chemistry","Electronic, Optical and Magnetic Materials","Surfaces, Coatings and Films"]}

Debugging further using org-ref-debug results in

#+TITLE: org-ref debug

org-ref: Version 1.1.1

* Variables
1. org-ref-completion-library: org-ref-helm-bibtex
2. org-ref-bibliography-notes: ~/bibliography/notes.org (exists t)
3. org-ref-default-bibliography: (~/bibliography/references.bib) (exists (t)) (listp t)
4. org-ref-pdf-directory: ~/bibliography/bibtex-pdfs/ (exists t)

* System
system-type: System: gnu/linux
system-configuration: x86_64-unknown-linux-gnu
window system: Window system: x
Emacs: GNU Emacs 26.0.50.1 (x86_64-unknown-linux-gnu, GTK+ Version 2.24.25)
 of 2017-07-07
org-version: 8.2.10

* about org-ref
org-ref installed in /home/jackin/.emacs.d/elpa/org-ref-20170626.1834/org-ref.elc.

** Dependencies
helm-bibtex /home/jackin/.emacs.d/elpa/helm-bibtex-20170321.1306/helm-bibtex.elc

* org-ref-pdf (loaded: t)
system pdftotext: /usr/bin/pdftotext
You set pdftotext-executable to pdftotext (exists: /usr/bin/pdftotext)

* org-ref-url-utils (loaded: nil)

* export variables
org-latex-pdf-process:
("pdflatex -interaction nonstopmode -output-directory %o %f" "pdflatex -interaction nonstopmode -output-directory %o %f" "pdflatex -interaction nonstopmode -output-directory %o %f")

My configuration file looks like

;;;;;;;;;;;;;;;;;;;;;;;;;
;;PAckage;;
;;;;;;;;;;;;;;;;
(require 'package)
(add-to-list 'package-archives '("org" . "http://orgmode.org/elpa/") t)
(setq package-archives '(("melpa" . "https://melpa.org/packages/")
             ("gnu" . "http://elpa.gnu.org/packages/")
                          ("marmalade" . "http://marmalade-repo.org/packages/")))
(package-initialize)

;;;;;;;;;;;;;;;;;;;;
;;;ORG-REF;;;;;;;;;
;;;;;;;;;;;;;;;;;;

(setq reftex-default-bibliography '("~/bibliography/references.bib"))

;; see org-ref for use of these variables
 (setq org-ref-bibliography-notes "~/bibliography/notes.org"
       org-ref-default-bibliography '("~/bibliography/references.bib")
       org-ref-pdf-directory "~/bibliography/bibtex-pdfs/")

(setq bibtex-completion-bibliography "~/bibliography/references.bib"
      bibtex-completion-library-path "~/bibliography/bibtex-pdfs"
      bibtex-completion-notes-path "~/bibliography/helm-bibtex-notes")

(setq org-latex-prefer-user-labels t)

(require 'doi-utils)
(require 'org-ref-url-utils)
(require 'org-ref-bibtex)
(require 'org-ref-pdf)
(require 'org-ref-latex)
(require 'org-ref)

When I try doi-insert-bibtex with a DOI, I get an error: Error while processing request: (biblio--url-error . timeout)

I am using GNU Emacs 26.0.50.1. Everything worked fine under the same version of Emacs until last month, when org-ref suddenly stopped fetching BibTeX entries. I am not sure what is going wrong. I also tried a fresh install of Emacs and org-ref, with the same result. Any help or pointers would be highly appreciated.

jkitchin commented 7 years ago

I am not able to reproduce this. What is the output of (doi-utils-debug "10.1021/jp511426q")?

jack836 commented 7 years ago

Thank you for the quick response. It seems the issue is taking a different direction now and may not have anything to do with org-ref.

First of all, (doi-utils-debug "10.1021/jp511426q") produces only doi:10.1021/jp511426q and two empty lines in a buffer named *debug-doi*.

Out of curiosity, I moved my PC off my institute's network (where I am behind a proxy), and org-ref started to fetch bibliography entries successfully again. That easily leads to the conclusion that the issue is with the proxy. But here is what still puzzles me:

I had issues with the proxy (and Emacs) when I started working at this institute, and was not able to use my favorite tools 'google-translate', 'eww' and 'org-ref', since they connect through HTTPS. I then found that it was a bug in Emacs and applied this patch: http://emacs.1067599.n8.nabble.com/bug-11788-url-http-does-not-properly-handle-https-over-proxy-td46070.html#a382353 After that, all my tools, including 'org-ref', started working fine. Now 'google-translate' and 'eww' continue to work fine, and only 'org-ref' refuses to fetch information.

When I do a M-x eww doi.org/10.1021/jp511426q (from behind the proxy) it takes me to the correct page as any other browser would do.

This makes me believe that I am missing something else. Since I am new to elisp, I am not aware of other ways of debugging or of generating more verbose output to identify what fails. Any help in this direction is highly appreciated.

It also makes me wonder if I am discussing an irrelevant topic here. If so I am sorry and please let me know.
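One way to get more verbose output is sketched below. It turns on the logging built into Emacs's url library and returns the raw response for a DOI, so an empty body or a proxy error page is easy to spot. The helper name my/doi-raw-response is hypothetical and not part of org-ref; the sketch assumes doi-utils is installed so that doi-utils-dx-doi-org-url is defined.

```elisp
(require 'url)
(require 'doi-utils)   ; assumed installed; provides doi-utils-dx-doi-org-url

;; Log url library activity to the *URL-DEBUG* buffer.
(setq url-debug t)

(defun my/doi-raw-response (doi)
  "Return the raw HTTP response (headers and body) for DOI as a string.
Hypothetical debugging helper, not part of org-ref."
  (let* ((url-request-method "GET")
         (url-mime-accept-string "application/citeproc+json")
         (buffer (url-retrieve-synchronously
                  (concat doi-utils-dx-doi-org-url doi))))
    (unless buffer
      (error "No response buffer for %s" doi))
    (with-current-buffer buffer
      ;; Grab the contents first, then clean up the response buffer.
      (prog1 (buffer-string)
        (kill-buffer buffer)))))
```

Evaluating (my/doi-raw-response "10.1021/jp511426q") and comparing the result with the curl output above should show whether the request fails in Emacs's own HTTP layer rather than in org-ref.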

jkitchin commented 7 years ago

It does sound like proxy related issues. I don't know much about these and don't have a way to test them either :(

jack836 commented 7 years ago

I strongly suspect the proxy too. The network administration opens and closes certain ports from time to time and also makes other changes. org-ref could have been caught behind one of those. I will continue trying different things and will post if something interesting shows up.

@jkitchin I sincerely appreciate all your efforts on 'org-ref'; I like it very much. Having been a Zotero user for a long time, I have now settled into a very efficient workflow using org-ref. I will continue to use the other features of 'org-ref', but will have to look for some other way to fetch .bib entries (for the time being).

chemmi commented 4 years ago

This is an old conversation, but I ran into the same problems using org-ref:

When I am behind my company proxy, I cannot fetch json metadata via doi-utils, but curl works fine. It seems like url-retrieve-synchronously returns empty data (except for a newline).

@jack836 Could you further nail down the problem or find a workaround for this?

jkitchin commented 4 years ago

One possibility is to replace the code that uses url-retrieve-synchronously with code that uses a shell command with curl. If that works, it could at least be a documented solution.

jack836 commented 4 years ago

@chemmi sorry to hear about the issue you are facing. Back then, I got around the issue by setting up another network (without a proxy) and switching between networks whenever I used org-ref. I didn't like the solution, but I learned to live with it.
Now I work in a different place where I am not behind a proxy, so I am afraid I may not be able to test things from my side. But since you have narrowed it down to the exact function, url-retrieve-synchronously, I think it is now only a matter of finding a replacement. What @jkitchin suggested could be a good starting point. Another possibility would be to look into how other packages like 'google-translate' fetch their data. I am still a beginner with elisp and hence cannot help much here. Though I will not be using it, I would still be happy to see a solution to the issue (which haunted me for a while).

chemmi commented 4 years ago

Thanks for your replies!

I have hacked together a (temporary) workaround that calls curl directly via call-process. Somehow request.el did not work either...

Pasting the following into my .emacs (overwriting doi-utils-get-json-metadata) works for me:

(require 'org-ref)
(defun doi-utils-get-json-metadata (doi)
  "Try to get json metadata for DOI.  Open the DOI in a browser if we do not get it."
  (let (;; (url-request-method "GET")
        ;; (url-mime-accept-string "application/citeproc+json")
        (json-object-type 'plist)
        (json-data)
        (url (concat doi-utils-dx-doi-org-url doi)))
    (with-temp-buffer
      ;; Fetch the metadata with curl instead of url-retrieve-synchronously.
      (call-process "curl" nil t nil
                    "--location"
                    "--silent"
                    "--header"
                    "Accept: application/citeproc+json"
                    url)
      (setq json-data (buffer-string))
      (cond
       ;; Known error responses: open the DOI in a browser and signal an error.
       ((or (string-match "<title>Error: DOI Not Found</title>" json-data)
            (string-match "Resource not found" json-data)
            (string-match "Status *406" json-data)
            (string-match "400 Bad Request" json-data))
        (browse-url url)
        (error "Something went wrong.  We got this response:
%s
Opening %s" json-data url))
       ;; The original post is cut off here; this final clause is an
       ;; assumed completion that parses the response as JSON.
       (t
        (json-read-from-string json-data))))))

jkitchin commented 4 years ago

I added an option to use a variant of the curl function described above. See the new variable doi-utils-metadata-function which can be set to use a curl function now. It is somewhat lightly tested, so confirmation it works for you would be appreciated.

chemmi commented 4 years ago

Thanks for that! I integrated it into my setup, and in a first test everything worked as expected. If I run into any problems, I will report them in this (closed) issue.

vnckppl commented 3 years ago

FYI: I am experiencing the same problem as described in the first line of the first post, and setting:

(setq doi-utils-metadata-function 'doi-utils-get-json-metadata-curl)

as suggested here fixes this.
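A slightly more defensive form of that setting might look like the sketch below. It only switches to the curl-based function when a curl executable is actually found on the PATH. The guard is an addition for illustration, not something org-ref requires; the variable and function names are the ones mentioned in this thread.

```elisp
(require 'doi-utils)

;; Use the curl-based metadata fetcher only if curl is available;
;; otherwise leave doi-utils-metadata-function at its default.
(when (executable-find "curl")
  (setq doi-utils-metadata-function 'doi-utils-get-json-metadata-curl))
```

With this in the init file, the same configuration can be shared between machines with and without curl installed.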