fiduswriter / biblatex-csl-converter

A set of JavaScript converters: bib(la)tex => json, json => csl, and json => biblatex
GNU Lesser General Public License v3.0

\url is parsed to ̆rl #38

Closed retorquere closed 7 years ago

retorquere commented 7 years ago

\url should invoke verbatim parsing if I remember correctly.

johanneswilm commented 7 years ago

What will you do with the links when converting to CSL? What fields should URLs be allowed in?

johanneswilm commented 7 years ago

The replacement in itself is not so problematic, I think -- one could simply add another replacement that replaces all \url with \hyperref (or random other string) first, and then replace them back with \url after \u replacement has taken place.
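A minimal sketch of that protect-and-restore idea, not the library's actual code: `replaceAccents` is a deliberately naive stand-in for the real accent step, and both function names and the `\XXurl` placeholder are assumptions chosen for illustration.

```javascript
// Combining breve, the accent that the TeX command \u stands for.
const BREVE = '\u0306'

// Naive stand-in for the accent replacement step: every "\u" becomes a breve.
// Run on its own, it also mangles "\url", which is the bug in this issue.
function replaceAccents(text) {
    return text.split('\\u').join(BREVE)
}

// Hide \url behind a placeholder, run the accent step, then restore \url.
function replaceAccentsKeepingUrl(text) {
    const placeholder = '\\XXurl' // assumed never to occur in real input
    const hidden = text.split('\\url').join(placeholder)
    return replaceAccents(hidden).split(placeholder).join('\\url')
}
```

The placeholder only has to avoid containing `\u` itself and must not appear in the input; otherwise the restore step would rewrite text it never hid.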

What is more problematic is that the contents of that URL will also have gone through TeX character replacement. I am not sure to what extent that is OK, or whether we need to store the original contents first (the ~ issue should be easy to solve, but if no character replacement is allowed at all within the URL, it gets more complicated).

retorquere commented 7 years ago

I don't know of any field other than note or abstract that would do something relevant with URLs, and I don't think we'd need to do anything special with them. It's just that inside \url, parsing is different: _ doesn't mean subscript, for example, it's just an underscore.

johanneswilm commented 7 years ago

Yes, googling around, it seems that usage of \url in bibliographies is generally not well handled, because it mostly doesn't work the way it does in running text. OK, I'm on this for now.

johanneswilm commented 7 years ago

What about a URL with a } inside of it? Are characters other than ~ and _ still escaped?

retorquere commented 7 years ago

To my recollection those would be OK (and would just mean literal braces), and unbalanced braces in verbatim mode would break biblatex.

johanneswilm commented 7 years ago

So a URL like:

\url{http://www.blabla.com/hand\}2/index.html}

would be ok?

retorquere commented 7 years ago

Nope, that would break

johanneswilm commented 7 years ago

So what would one do?

retorquere commented 7 years ago

Panic and bail like biblatex does. This isn't valid input. I'd say flag it as an error. It's one thing to have semantic quibbles about what something means, but this simply isn't parsable. \url{http://www.blabla.com/hand{}2/index.html} is fine though and would just have those braces in the output.
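To make the distinction concrete, here is a rough sketch of a verbatim-style reader that treats every brace literally, ignores backslashes, and bails on unbalanced input. It assumes the input is exactly one `\url{...}` command; `parseUrlCommand` is a hypothetical name, not something the parser exposes.

```javascript
// Returns the raw contents of a single "\url{...}" command.
// Braces inside the argument are kept literally but must balance;
// a backslash does NOT escape the brace that follows it.
function parseUrlCommand(source) {
    const prefix = '\\url{'
    if (!source.startsWith(prefix)) {
        throw new Error('not a \\url command')
    }
    let depth = 1
    for (let i = prefix.length; i < source.length; i++) {
        if (source[i] === '{') {
            depth++
        } else if (source[i] === '}') {
            depth--
        }
        if (depth === 0) {
            if (i !== source.length - 1) {
                // The argument closed early, so a stray "}" follows later.
                throw new Error('unbalanced braces: argument closes at position ' + i)
            }
            return source.slice(prefix.length, i)
        }
    }
    throw new Error('unbalanced braces: argument never closes')
}
```

With this sketch, the `hand{}2` URL comes back unchanged with its braces intact, while the `hand\}2` variant throws, because the `}` after the backslash closes the argument and leaves the final `}` unmatched.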

johanneswilm commented 7 years ago

Uh, just saw your last comment

retorquere commented 7 years ago

Oh, anything that isn't wildly wrong won't be worse than what biblatex does with this. I wouldn't worry about it. What does the parser do now when confronted with that?

I'd be more concerned with what you're going to output when that sequence occurs. Neither \vphantom nor \rbrace works in verbatim mode, so there really isn't a way to output that so it will render.

johanneswilm commented 7 years ago

OK, my work is done on this URL stuff. It may not be perfect though, and you are welcome to find another solution. Basically it first renames \url to \XXurl, then replaces \u with ̆ (a combining breve), then replaces \XXurl with \url. It also handles \url, but differently from all the other commands, in that it doesn't permit styling inside of it and it doesn't escape the characters inside of it. When exporting to CSL, it simply outputs the contents as plain text. When outputting biblatex, it doesn't TeX-escape the contents.
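The export side of that could look roughly like the following. This is only a sketch under assumptions: the `{type, text}` node shape, the function names, and the short list of escaped characters are all hypothetical and do not reflect the converter's real internal format.

```javascript
// Escape a few TeX special characters in ordinary text (deliberately incomplete).
function texEscape(text) {
    return text.replace(/[&%$#_]/g, char => '\\' + char)
}

// Biblatex output: ordinary text is escaped, url contents are emitted verbatim.
function toBiblatex(nodes) {
    return nodes.map(
        node => node.type === 'url' ? '\\url{' + node.text + '}' : texEscape(node.text)
    ).join('')
}

// CSL output: a url node is written out as plain text, like any other text.
function toCslPlainText(nodes) {
    return nodes.map(node => node.text).join('')
}

// Hypothetical parsed field: one text node followed by one url node.
const exampleNodes = [
    {type: 'text', text: 'See R&D at '},
    {type: 'url', text: 'http://example.com/a_b'}
]
```

Calling `toBiblatex(exampleNodes)` escapes the `&` in the text node but leaves the underscore inside the URL alone, which is the asymmetry described above.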

This is better than what there was before, but I think in FW I will simply disallow using \url at all, as it just seems to create a mess.

Again, feel free to figure out some other way to handle this.

retorquere commented 7 years ago

Given the current parser setup, this seems the best approach.