fiduswriter / biblatex-csl-converter

A set of JavaScript converters: bib(la)tex => json, json => csl, and json => biblatex
GNU Lesser General Public License v3.0

\# in url field means # rather than \# #67

Closed retorquere closed 7 years ago

retorquere commented 7 years ago
@Article{ck1,
  Url = {http://www.ip-sl.org/procs/abs07.html\#talk9}
}

This renders the URL as http://www.ip-sl.org/procs/abs07.html#talk9; the parser currently yields http://www.ip-sl.org/procs/abs07.html\#talk9

johanneswilm commented 7 years ago

Checking with biblatex, that outputs as

url: http://www.ip-sl.org/procs/abs07.html%5C#talk9.

The Url field is a verbatim + url-escaped field. Why would one insert a \ there?
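To illustrate why a literal backslash in a URL-escaped field comes out as %5C, here is a minimal sketch using JavaScript's standard `encodeURI`. It is not how biber or this converter processes the field; it only shows that generic URL escaping percent-encodes `\` while leaving `#` alone. The variable names are made up for the example.

```javascript
// Illustration only: generic URL escaping, not biber's or the converter's code.
// The JavaScript string literal "\\" is a single backslash character.
const rawUrl = "http://www.ip-sl.org/procs/abs07.html\\#talk9";

// encodeURI leaves reserved characters such as ":", "/" and "#" untouched,
// but percent-encodes the backslash as %5C.
const escapedUrl = encodeURI(rawUrl);

console.log(escapedUrl);
// prints "http://www.ip-sl.org/procs/abs07.html%5C#talk9"
```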

retorquere commented 7 years ago

I can't say generally (I just found this in the references in my test set), but if I remove the \ in the MWE I get

You can't use `macro parameter character #' in horizontal mode.

johanneswilm commented 7 years ago

I see you are trying to do it in the sharelatex file. But I wonder: does it really behave exactly the same with a proper bibtex file as it does when you generate the bibtex file from within a latex file, the way you do there?

retorquere commented 7 years ago

I'll give that a go right now, but Overleaf shows the same behaviour.

johanneswilm commented 7 years ago

Overleaf/Sharelatex have merged, AFAIK.

johanneswilm commented 7 years ago

Hmm, yes, with the good old natbib I get the same result you get on overleaf/sharelatex.

So basically biblatex and bibtex are incompatible in that regard.

johanneswilm commented 7 years ago

The same goes for % and I guess all other special latex characters.

retorquere commented 7 years ago

I tried this

% arara: pdflatex
% arara: bibtex
% arara: pdflatex
% arara: pdflatex
% arara: lmkclean

\documentclass{article}

\usepackage{filecontents}
\usepackage[T1]{fontenc}

\begin{filecontents}{\jobname.bib}
@Article{ck1,
  Url = {http://www.ip-sl.org/procs/abs07.html\#talk9}
}
\end{filecontents}

\begin{document}

 \cite{ck1}

  \H{o} "long Hungarian umlaut (double acute)"
\bibliographystyle{IEEEtran}
\bibliography{\jobname}
\end{document}

and ran it using arara and I got the same behavior -- # errors out, \# does what I expected.

Oh great, so bibtex and biblatex are at odds here. Well, I could post-process all \s out of URLs on the assumption that no one would want them there.

johanneswilm commented 7 years ago

Oh great, so bibtex and biblatex are at odds here. Well, I could post-process all \s out of URLs on the assumption that no one would want them there.

That is the question. But it seems that Chrome turns any \ inside a URL into a /, so the web pages that really do have a \ in their address will be fairly limited (and they won't get many visitors). So based on that, yes, it seems rather safe for us to remove all \-characters from URLs.
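The approach discussed here, dropping every backslash from a URL value, could be sketched roughly as follows. `stripBackslashes` is a hypothetical helper written for this illustration; it is not the converter's actual code.

```javascript
// Hypothetical helper: remove every backslash from a URL string, on the
// assumption (discussed above) that real URLs practically never contain one.
function stripBackslashes(url) {
  // /\\/g matches each literal backslash character.
  return url.replace(/\\/g, "");
}

// BibTeX-style escapes such as \# and \% collapse to the plain character.
console.log(stripBackslashes("http://www.ip-sl.org/procs/abs07.html\\#talk9"));
// prints "http://www.ip-sl.org/procs/abs07.html#talk9"
```

The trade-off is that a URL which genuinely contains a backslash would be altered, which is the case the browser behaviour described above makes unlikely.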

retorquere commented 7 years ago

Yep, you're right, running this will get the %5C

% arara: pdflatex
% arara: biber
% arara: pdflatex
% arara: pdflatex
% arara: lmkclean

\documentclass{article}

\usepackage{filecontents}
\usepackage[T1]{fontenc}
\usepackage[backend=biber,style=ieee]{biblatex}

\begin{filecontents}{\jobname.bib}
@Article{ck1,
  Url = {http://www.ip-sl.org/procs/abs07.html\#talk9}
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}

 \cite{ck1}

\printbibliography
\end{document}

johanneswilm commented 7 years ago

Firefox uses %5C.

retorquere commented 7 years ago

Yeah well people should just not use backslashes in their URLs. Bugger MS for ever getting that annoyance in place.

retorquere commented 7 years ago

Oh hey you're already removing them in the parse phase. Cool, I was planning to post-process them.

johanneswilm commented 7 years ago

I checked whether there were URLs that had backslashes in them, and then came across the Chrome issue, which effectively means that it's not possible currently to have such URLs [1]. Therefore I thought it cannot do much harm to remove them.

[1] https://stackoverflow.com/questions/10438008/different-behaviours-of-treating-backslash-in-the-url-by-firefox-and-chrome
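The backslash-to-slash behaviour can be reproduced outside a browser with the WHATWG `URL` class. This is a minimal sketch, assuming a runtime that provides `URL` as a global (Node.js or a current browser); `normalizeUrl` and the example.com address are made up for the illustration.

```javascript
// Hypothetical helper: parse a URL with the WHATWG URL parser and return
// its serialized form.
function normalizeUrl(url) {
  return new URL(url).href;
}

// For http(s) URLs the parser treats a backslash in the path like a slash,
// so the backslash does not survive parsing.
console.log(normalizeUrl("http://example.com/a\\b"));
// prints "http://example.com/a/b"
```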

retorquere commented 7 years ago

Nice, that. Bad Chrome.