tmalsburg / helm-bibtex

Search and manage bibliographies in Emacs
GNU General Public License v2.0

Can't open PDF when file path contains Western European characters #152

Open qyin opened 8 years ago

qyin commented 8 years ago

Mendeley escapes every non-ASCII character in the bib files it generates. The following is a small example generated by Mendeley.

@article{Moser2010,
author = {Moser, Robin A. and Tardos, G{\'{a}}bor},
doi = {10.1145/1667053.1667060},
file = {:home/qiang/Documents/Mendeley Desktop/Journal of the ACM/2010/2010 - A constructive proof of the general lov{\'{a}}sz local lemma.pdf:pdf},
issn = {00045411},
journal = {J. ACM},
keywords = {Constructive proof,Lov{\'{a}}sz local lemma,parallelization},
month = {jan},
number = {2},
pages = {1--15},
publisher = {ACM},
title = {{A constructive proof of the general lov{\'{a}}sz local lemma}},
url = {http://dl.acm.org/citation.cfm?id=1667053.1667060},
volume = {57},
year = {2010}
}

Note that the file name "2010 - A constructive proof of the general lovász local lemma.pdf" is transformed to "2010 - A constructive proof of the general lov{\'{a}}sz local lemma.pdf". As a result, helm-bibtex can't find the related PDF file.
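
To make the mismatch concrete, here is a minimal Python sketch (not helm-bibtex code) of a naive lookup on the file field above. It assumes the Mendeley layout seen in the example, `:path:type`, with multiple files separated by `;` and the leading slash of the path omitted. The helper name `parse_file_field` is hypothetical, and paths that themselves contain a colon are not handled.

```python
import os


def parse_file_field(field):
    """Split a Mendeley-style file field into (path, type) pairs.

    Assumes each entry looks like ':path:type' and entries are
    separated by ';'. Entries that do not fit this shape are skipped.
    """
    entries = []
    for item in field.split(";"):
        parts = item.split(":")
        if len(parts) != 3:
            continue
        _description, path, file_type = parts
        # Assumption: the path is absolute but written without its leading slash.
        entries.append(("/" + path.lstrip("/"), file_type))
    return entries


field = (
    r":home/qiang/Documents/Mendeley Desktop/Journal of the ACM/2010/"
    r"2010 - A constructive proof of the general lov{\'{a}}sz local lemma.pdf:pdf"
)
for path, file_type in parse_file_field(field):
    # The path still contains the literal escape {\'{a}}, so it cannot
    # match a file on disk whose name contains the character á.
    print(path, os.path.exists(path))
```

The extracted path keeps the escape sequence verbatim, which is why a lookup based on the raw field misses the real file.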

tmalsburg commented 8 years ago

I'd file a bug with Mendeley. To my knowledge there is no reason to escape these characters. I use all kinds of unicode glyphs in my bibliography and never had a problem.

(Sent from my cell phone.)

qyin commented 8 years ago

Hi @tmalsburg, thanks for the quick reply.

My guess is that the reason is that BibTeX is not compatible with Unicode. BibTeX has been around since the mid-80s and does not handle Unicode. The current version is 0.99d.

$ bibtex --version
BibTeX 0.99d (TeX Live 2016)
kpathsea version 6.2.2

Yes, there are other ways to work around this, as you said, e.g. one can use biber, a replacement for BibTeX that handles Unicode very well. The point is that many people may still use BibTeX rather than biber. This is on the one hand due to old habits, and on the other hand because some LaTeX packages have an explicit dependence on BibTeX itself and will not work with biber.

So I suggest that, for the sake of compatibility, helm-bibtex add a function that transforms the escaped characters in the file field back to Unicode.
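
As a rough illustration of the suggested transformation, here is a minimal Python sketch (helm-bibtex itself is Emacs Lisp, so this is not a patch). It only handles the `{\'{a}}` form seen in the Mendeley example, and only a handful of accent commands. The names `COMBINING`, `ACCENT` and `decode_latex_accents` are made up for this sketch, and the result is assumed to be wanted in NFC-normalized form.

```python
import re
import unicodedata

# Combining marks for a few common LaTeX accent commands (not exhaustive).
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis / umlaut
    "~": "\u0303",  # tilde
}

# Matches only the fully braced form {\<accent>{<letter>}}, e.g. {\'{a}}.
ACCENT = re.compile(r"""\{\\(['`^"~])\{([A-Za-z])\}\}""")


def decode_latex_accents(text):
    """Replace escapes such as {\\'{a}} with the precomposed character."""

    def replace(match):
        accent, letter = match.groups()
        # Compose base letter + combining mark into a single code point.
        return unicodedata.normalize("NFC", letter + COMBINING[accent])

    return ACCENT.sub(replace, text)


escaped = r"2010 - A constructive proof of the general lov{\'{a}}sz local lemma.pdf"
print(decode_latex_accents(escaped))
```

Escapes the pattern does not recognize are left untouched, so a path without accents passes through unchanged.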

tmalsburg commented 8 years ago

I've been using bibtex with utf-8 encoded bib files for many years and it has worked flawlessly. All you have to do is include these lines in the header of your latex document:

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}

Here is an example document. The latex file (manuscript.tex):

\documentclass[apacite]{apa6}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\begin{document}
Test
\cite{MetznerEtAl2016}
\bibliography{manuscript}
\end{document}

The bib file (manuscript.bib):

@article{MetznerEtAl2016,
  author = {Metzner, Paul and von der Malsburg, Titus and Vasishth, Shravan and Rösler, Frank},
  title = {The importance of reading naturally: {Evidence} from combined recordings of eye movements and electric brain potentials},
  year = {2016},
  journal = {Cognitive Science},
  pubstate = {inpress},
  keywords = {ERP, eye movements, reading, coregistration, n400, p600},
  abstract = {How important is the ability to freely control eye movements for reading comprehension?  And how does the parser make use of this freedom?  We investigated these questions using coregistration of eye movements and event-related brain potentials (ERPs) while participants read either freely or in a computer-controlled word-by-word format (also known as RSVP).  Word-by-word presentation and natural reading both elicited qualitatively similar ERP effects in response to syntactic and semantic violations (N400 and P600 effects).  Comprehension was better in free reading but only in trials in which the eyes regressed to previous material upon encountering the anomaly.  A more fine-grained ERP analysis revealed that these regressions were strongly associated with the well-known P600 effect.  In trials without regressions, we instead found sustained centro-parietal negativities starting at around 320 ms post-onset, however, these negativities were only found when the violation occurred in sentence-final position.  Taken together, these results suggest that the sentence processing system engages in strategic choices: In response to words that don’t match built-up expectations, it can either explore alternative interpretations (reflected by regressions, P600 effects, and good comprehension) or pursue a "good-enough" processing strategy that tolerates a deficient interpretation (reflected by progressive saccades, sustained negativities, and relatively poor comprehension).},
}

Note the non-ASCII character in “Rösler”.

qyin commented 8 years ago

Hi @tmalsburg, thanks for the tip and example.

Yes, with these two lines BibTeX works very well.

So I guess this won't be fixed in helm-bibtex.

tmalsburg commented 8 years ago

> So I guess this won't be fixed in helm-bibtex.

I didn't say this. I just wanted to argue against the premise that we have to fix this. For me personally it's not a priority and I still think a fix in Mendeley would be preferable. But if someone makes a PR for this, I'd definitely consider including it.

qyin commented 8 years ago

OK, that's fair enough.