retorquere / zotero-better-bibtex

Make Zotero effective for us LaTeX holdouts
https://retorque.re/zotero-better-bibtex/
MIT License
5.37k stars, 289 forks

List of escaped characters & how to avoid the escaping #315

Closed: gracile-fr closed this issue 9 years ago

gracile-fr commented 9 years ago

Some characters are escaped on export by BBT, but I can't find the exact list. Moreover, I'm not sure of the rationale behind that. I imagine that's to allow users to put some LaTeX code in fields, and preserve that on export, right? But how can we "escape the escaping"? (I heavily use in my Zotero tags the hash # and the underscore _).

retorquere commented 9 years ago

The list is derived from http://www.w3.org/Math/characters/unicode.xml . The default way to avoid escaping is to turn off ascii mode for the translator. It's off by default for BibLaTeX, on by default for BibTeX.
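
As a rough illustration of what "ascii mode" does, here is a minimal sketch. It assumes a tiny hand-written mapping table; `TO_TEX` and `encodeText` are hypothetical names, and BBT's real table (derived from unicode.xml) is far larger and may choose different TeX commands.

```javascript
// Illustrative only: a tiny hand-written table, NOT BBT's actual mapping.
const TO_TEX = new Map([
  ['\u00e7', '\\c{c}'], // c with cedilla
  ['\u00e9', "\\'e"],   // e with acute accent
  ['\u00fc', '\\"u'],   // u with diaeresis
]);

// Hypothetical helper: with asciiMode on, replace each non-7-bit-ASCII
// character that has a table entry; with it off, return the text unchanged.
function encodeText(text, asciiMode) {
  if (!asciiMode) return text;
  // Normalize first so accented letters arrive as single code points.
  return Array.from(text.normalize('NFC'), (ch) =>
    ch.codePointAt(0) > 0x7f && TO_TEX.has(ch) ? TO_TEX.get(ch) : ch
  ).join('');
}

console.log(encodeText('Fran\u00e7ois H\u00e9delin', true));
console.log(encodeText('Fran\u00e7ois H\u00e9delin', false));
```

With ascii mode on, the name comes out in the same form as the `author` field in the MWE further down the thread; with it off, the Unicode characters pass through untouched.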

retorquere commented 9 years ago

The main rationale is that a BBT bibliography should yield the same(ish) output the references would produce through zotero itself. You might want to look at the "going hardcore" section of the BBT site.

gracile-fr commented 9 years ago

> The default way to avoid escaping is to turn off ascii mode for the translator. It's off by default for BibLaTeX,

It's off indeed.

> The main rationale is that a BBT bibliography should yield the same(ish) output the references would produce through zotero itself

Yes, I actually agree with that. I think I had forgotten (or didn't know) that the hash and the underscore must be escaped to be treated literally by BibLaTeX. Thanks for the clarification! I was confused because, in some cases, we need to avoid the escaping, i.e. for BBT syntax. Any idea about #313?

gracile-fr commented 9 years ago

Do the reasons to escape these chars apply to tag names (keywords)? I had a bunch of tags containing or beginning with _. I changed that (I now use - instead), but I wonder whether escaping is really needed for them. Tag names don't appear in citations or bibliographies; they have another purpose (e.g. filtering entries in bibliographies).

retorquere commented 9 years ago

I guess you're right. The BibLaTeX manual isn't exceedingly clear on this point: it says keywords are "literal values", but it doesn't say the field is verbatim, as it does for some other fields.

retorquere commented 9 years ago

On the other hand, sharelatex doesn't like this MWE at all:

\RequirePackage{filecontents}
\begin{filecontents*}{\jobname.bib}
@online{zotero-1855286-1414,
  title = {Patent Landscape analysis: sample from {S}40RTS},
  url = {http://www.patanalyse.com/sample-analysis.html},
  timestamp = {2014-10-18 17:04:22},
  urldate = {2014-10-18},
  author = {Fran\c{c}ois H\'edelin abb\'e d' Aubignac},
  keywords = {#1, _2, %3}
}
\end{filecontents*}
\documentclass[]{article}
\usepackage[autostyle]{csquotes}
\usepackage[
    backend=biber,
    style=authoryear-icomp,
    sortlocale=de_DE,
    natbib=true,
    url=false, 
    doi=true,
    eprint=false
]{biblatex}
\addbibresource{\jobname.bib}
\usepackage[]{hyperref}
\hypersetup{colorlinks=true,}
\begin{document}
\nocite{*}
\printbibliography 
\end{document}

gracile-fr commented 9 years ago

Indeed, but I can compile this one without any problem: keywords = {\#1, _2, \%3}. (Just out of curiosity: you seem to use sharelatex, but what's your offline editor on Windows?)

retorquere commented 9 years ago

OK, that shouldn't be too hard to do.

I use sharelatex for checks on MWEs, not for writing; I don't trust cloud providers or Vodafone sufficiently to commit to an online-first workflow, so all my work is offline-first + sync.

I don't currently have a Windows machine active, but I use vim pretty much everywhere. I have tried a few IDEs, also on Windows, but I miss my vi keybindings. The one thing that sometimes makes me curious about IDEs is the facility to move whole sections of text at once, but none of the IDEs I've found so far do that well enough to justify going without vi keybindings.

I've tried emacs, but never could wrap my head around it, even with the vi keybindings.

gracile-fr commented 9 years ago

Don't bother with "un-escaping" the underscore in keywords, I've changed all my tags already :-)

gracile-fr commented 9 years ago

Emacs with AUCTeX seems very good, but I'm just using Notepad++ (or Texmaker). I'd really like to have very good syntax highlighting, folding of footnote/footcite, a document outline…

retorquere commented 9 years ago

Too late, tests are running.

I've tried Emacs, but I didn't even get to AUCTeX; just using Emacs is enough of a hurdle to be a put-off.

I'd love to have all of those, but not at the expense of vi keybindings.

TomBener commented 3 years ago

> The list is derived from http://www.w3.org/Math/characters/unicode.xml . The default way to avoid escaping is to turn off ascii mode for the translator. It's off by default for BibLaTeX, on by default for BibTeX.

Comments from 2020!

It seems ascii mode is off for both BibLaTeX and BibTeX. However, special symbols like _ are still escaped in the Better BibTeX export.

The screenshot below was taken at Preferences -> Advanced -> Config Editor, after searching for ascii.

[Screenshot: Config Editor filtered on "ascii"]

Should I edit it here?

I often use Pandoc to convert .md files to .docx files, with .bib files for citations. Unlike LaTeX, Pandoc cannot correctly process escaped symbols such as _ in a URL, which breaks the links in the exported .docx files.

So how can I disable the escaping of symbols when exporting as Better BibTeX? Thank you 😊

retorquere commented 3 years ago

You can edit them there or in the prefs screen -- both achieve the same.

That ascii setting governs whether non-7-bit-ASCII characters get translated to TeX commands. Escaping of a few 7-bit ASCII characters still happens -- #$%&<>\\^_{}~ have special meaning in TeX outside verbatim mode, and nothing I could find in the biblatex manual says keyword lists are verbatim. You can't further prevent escaping. URL fields are verbatim though, so they should not see _ escaped. If URL fields are escaped for you, please open a new issue.
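
The distinction between verbatim and non-verbatim fields can be sketched roughly as below. This is an illustration under stated assumptions, not BBT's code: `ESCAPES`, `VERBATIM_FIELDS` and `encodeField` are hypothetical names, the set of verbatim fields is an assumed subset, and the replacement commands chosen for `\`, `^`, `~`, `<` and `>` are one plausible choice among several.

```javascript
// Assumed replacements for the TeX-special ASCII characters named above.
const ESCAPES = {
  '#': '\\#', '$': '\\$', '%': '\\%', '&': '\\&', '_': '\\_',
  '{': '\\{', '}': '\\}',
  '\\': '\\textbackslash{}',
  '^': '\\textasciicircum{}',
  '~': '\\textasciitilde{}',
  '<': '\\textless{}',
  '>': '\\textgreater{}',
};

// Assumed subset of fields that are written out verbatim (no escaping).
const VERBATIM_FIELDS = new Set(['url', 'doi', 'file']);

// Hypothetical encoder: verbatim fields pass through unchanged; every
// other field has its special characters escaped in a single pass, so
// the backslashes introduced by escaping are never escaped again.
function encodeField(name, value) {
  if (VERBATIM_FIELDS.has(name)) return value;
  return value.replace(/[#$%&_{}\\^~<>]/g, (ch) => ESCAPES[ch]);
}

console.log(encodeField('keywords', '#1, _2, %3'));
console.log(encodeField('url', 'http://example.com/a_b'));
console.log(encodeField('howpublished', 'http://example.com/a_b'));
```

Under these assumptions, the same URL stays intact in `url` but gains a backslash before the underscore in `howpublished`, because only the former is treated as verbatim.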

TomBener commented 3 years ago

Sorry for my unclear description.

The URL field is indeed not escaped. The issue I encountered was that the URL is in the howpublished field. For example:

@misc{huzhizhong2017,
  title = {逃不开的悬空村},
  author = {{胡志中}},
  year = {2017},
  howpublished = {http://news.cyol.com/content/2017-08/16/content\_16396645.htm},
  key = {hu zhi zhong},
  note = {2020-11-16}
}

The original link is http://news.cyol.com/content/2017-08/16/content_16396645.htm, but in the howpublished field the _ is escaped, which breaks the link in the .docx file after conversion by Pandoc.

As an alternative, is it possible to export the URL in a url field instead of the howpublished field? That way, the escaping issue would be avoided. Thanks!

retorquere commented 3 years ago

That example does not have a url field; it has a howpublished field, which isn't a verbatim field, so it gets escaped.

If you want the URL in the url field, you should either use biblatex, or enable the URL option for bibtex in the prefs.

retorquere commented 3 years ago

BTW if pandoc doesn't unescape the underscore in the howpublished field it's wrong. Unless you're using pandoc as a tex compiler (which just forwards everything to the actual latex stack, so the escaped underscores will work) you want to be using pandoc with Better CSL for much better results and none of this encoding hoopla.
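
What "unescaping" means for a consumer of the .bib file can be sketched as follows. This is only an illustration of the idea, not how pandoc implements it; `unescapeTeX` is a hypothetical helper that handles just the single-character backslash escapes.

```javascript
// Hypothetical helper: strip the backslash from single-character TeX
// escapes such as \_ \# \% \& \$ \{ \} to recover the literal text.
function unescapeTeX(value) {
  return value.replace(/\\([#$%&_{}])/g, '$1');
}

const howpublished =
  'http://news.cyol.com/content/2017-08/16/content\\_16396645.htm';
console.log(unescapeTeX(howpublished));
```

Applied to the `howpublished` value from the example above, this yields the original link with a plain underscore.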

TomBener commented 3 years ago

Thanks for your kind reply!

I checked it carefully again. Pandoc handles the howpublished field correctly, except that the URL is not a clickable link in the .docx file. The escaping problem I attributed to Pandoc arose because I had manually changed the howpublished field to a url field. The issue can be avoided with this sample postscript for BibTeX:

if (Translator.BetterBibTeX && item.itemType === 'webpage') {
  // For webpages only: emit the verbatim-encoded URL inside \url{} in howpublished
  if (item.url) {
    reference.add({ name: 'howpublished', bibtex: "{\\url{" + reference.enc_verbatim({ value: item.url }) + "}}" });
  }
}

Thank you so much 😘

retorquere commented 3 years ago

That isn't necessary to do manually or by postscript. There's a setting in the prefs for URL handling for bibtex that will put the URL in a url field.

TomBener commented 3 years ago

I know about that setting. But I don't want to export the URL for general literature such as articles or books, where it looks a little weird; I only need it for webpages 😂

github-actions[bot] commented 3 years ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.