retorquere / zotero-better-bibtex

Make Zotero effective for us LaTeX holdouts
https://retorque.re/zotero-better-bibtex/
MIT License
5k stars 277 forks

Dashes to \hyphen #612

Closed dbankmann closed 6 years ago

dbankmann commented 7 years ago

Hey,

I used my Zotero database for export with biblatex. However, now I have to export to bibtex and am running into trouble, because my titles contain <pre>\hyphen</pre> tags, and \hyphen is only supported by biblatex.

It would be nice if you could specify whether a certain tag should be used for export with biblatex or bibtex.

Or even better, for this specific case, it would be nice to have an option to tell the export function to transform dashes to <pre>\hyphen</pre> whenever exporting with biblatex.

retorquere commented 7 years ago

how do you currently generate these? AFAICT BBT doesn't generate \hyphen

dbankmann commented 7 years ago

Oops. I'm using the <pre> tags which got cropped by the github editor in my last post.

retorquere commented 7 years ago

Can you submit a reference which has a hyphen you'd want to transform? You can submit a reference by right-clicking one in your zotero library and selecting "Report BBT error"

dbankmann commented 7 years ago

Sure: AFFJM57E

retorquere commented 7 years ago

Ah, I thought you had a reference with a unicode hyphen like ‐ or ‑. Re-interpreting stuff in <pre> tags isn't something I can do easily, the whole point of them being that whatever is in there gets passed through without BBT touching it.

You could try to achieve what you want with a postscript, something like

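// Postscript: for BetterBibTeX exports only, replace every <pre>\hyphen</pre> in the title with a plain dash
if (this.has.title && Translator.BetterBibTeX) {
    this.has.title.bibtex = this.has.title.bibtex.replace(/<pre>\\hyphen<\/pre>/g, '-')
}

dbankmann commented 7 years ago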

That should work, thanks! However, I was rather suggesting not to parse stuff in <pre> tags, instead parse hyphens and replace them with <pre>\hyphen </pre> when exporting with biblatex.

So I, as the user, do not have to mess around with these tags and still get everything I need.

retorquere commented 7 years ago

That's why I asked for a reference with a hyphen in it 😄. With that I can maybe cook something up. I'm not sure I should be changing regular dash to \hyphen everywhere, but there are dedicated hyphen characters (like ‐ or ‑ -- see more here) which I could consider translating in a context-specific way.

dbankmann commented 7 years ago

Oh :D. That makes sense! So I could use U+2010 and U+2011 in my titles, etc. and they would translate to ASCII hyphen for bibtex and the appropriate hyphens for biblatex.

retorquere commented 7 years ago

Not currently, but I can see if I can make that happen.
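
The mapping proposed in the last two comments could be sketched as a plain string function. This is only an illustration, not something BBT does: `translateHyphens` and its `target` argument are hypothetical names, and the trailing space after `\hyphen` is an assumption made so that the command name ends before the next letter.

```javascript
// Hypothetical helper, not part of BBT.
// U+2010 is HYPHEN, U+2011 is NON-BREAKING HYPHEN.
const UNICODE_HYPHENS = /[\u2010\u2011]/g

function translateHyphens(text, target) {
  // biblatex: \hyphen plus a trailing space (assumption: ends the command name)
  // anything else (e.g. bibtex): plain ASCII hyphen-minus
  const replacement = target === 'biblatex' ? '\\hyphen ' : '-'
  // A function replacement avoids any special handling of "$" patterns.
  return text.replace(UNICODE_HYPHENS, () => replacement)
}
```

Note that U+2011 is a non-breaking hyphen, so mapping it to the breakable \hyphen is debatable; the sketch only follows the suggestion as written above. Whether these characters reach a postscript unchanged has not been verified here.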

retorquere commented 7 years ago

What do you want to achieve BTW? Is there a reason to prefer \hyphen over \emdash or \endash?

retorquere commented 7 years ago

And why doesn't BibTeX support \hyphen? Do you have a minimal working (well, broken in this case I guess) example for me? I thought \hyphen was just LaTeX, nothing Bib(La)TeX specific.

dbankmann commented 7 years ago

I'd like to have working hyphenation for my bib entries ;). I have lots of works with compound words in their titles, like differential-algebraic. Without the \hyphen command, LaTeX won't apply any hyphenation to these words.

And no, actually \hyphen is biblatex related. See http://tug.ctan.org/macros/latex/exptl/biblatex/doc/biblatex.pdf p. 116 .

An explicit, breakable hyphen intended for compound words. In contrast to a literal
‘-’, this command allows hyphenation in the rest of the word. It is similar to the
"= shorthand provided by some language modules of the babel/polyglossia
packages.

Solutions for bibtex exist, they are much more cumbersome though, e.g. http://www.latex-community.org/forum/viewtopic.php?f=50&t=3584 since in this case you need to declare a new command in your main tex-file.

retorquere commented 7 years ago

It is possible for BBT to declare new commands in the @preamble (which I already do for \noopsort), so that's not an insurmountable problem. It would for example be possible to translate U+2010 to \hyphen, and add something to the @preamble for BetterBibTeX export to define it, assuming a sensible command can be constructed. I don't quite follow what goes on in the last thread you mentioned; I am (such as it is) a better JavaScript programmer than a LaTeX one.
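
As a rough, untested sketch of the @preamble idea: the function below only builds the text of a BibTeX @preamble entry. `hyphenPreamble` is a hypothetical name, and the TeX definition inside it is an assumption (a hyphen followed by a zero penalty and zero-width glue, a commonly cited way to let the rest of a compound word hyphenate); it is not taken from biblatex or from BBT.

```javascript
// Hypothetical helper, not part of BBT.
// Builds a BibTeX @preamble entry that defines \hyphen only if the
// document has not defined it already (\providecommand).
function hyphenPreamble() {
  // Assumed definition: literal hyphen, break allowed, zero-width glue.
  const definition = '\\providecommand{\\hyphen}{-\\penalty0\\hskip0pt\\relax}'
  return `@preamble{ "${definition}" }`
}
```

Using \providecommand rather than \newcommand means the entry should not clash if the same .bib file is later used with biblatex, which already defines \hyphen.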

dbankmann commented 7 years ago

I'm not that familiar with bibtex and latex either;) However, that sounds like a solution to me! I personally, wouldn't care that much about bibtex at that point. If the journal doesn't support biblatex it shouldn't be my fault. I just don't want to have two different sets of bibliographies for biblatex export and bibtex;)

retorquere commented 7 years ago

I do have to care about bibtex though. @njbart, any ideas on the matter?

retorquere commented 7 years ago

@dbacc after reading through that BibTeX discussion you linked to, I'm not sure I can safely translate any and all dashes to \hyphen (for biblatex) or {\-} (for bibtex). I'm looking at specific hyphen chars that I could do this for, and I see a few candidates here, but I'm lost as to what these dashes all mean. It seems pretty clear that the soft hyphens (00AD/058A/1806) would be good candidates, but I don't know what distinguishes a hyphen like 2010 from a soft hyphen, semantically, or how I should interpret all the rest.

njbart commented 7 years ago

Since you asked: I think the \hyphen command from biblatex is only intended to replace the ordinary hyphen (a.k.a. hyphen-minus, U+002D) in compound words. This has nothing to do with soft hyphens.

For biblatex, BBT could in principle replace all U+002D chars in compound words by \hyphen  (note the trailing space) upon export, to improve hyphenation of the rest of the compound word – provided you feel this is worth the effort.

For bibtex, U+002D chars should be left alone (and most certainly not converted to \- soft hyphens).

retorquere commented 7 years ago

What counts as a compound word? <letter><002D><letter>?

njbart commented 7 years ago

I guess so.

(Note that the replacement is pointless if neither of the words on either side of the hyphen can be hyphenated itself. I realise that BBT can hardly check for this per se, but maybe it'd be worth allowing the replacement only if a word on either side of the hyphen contains three or more letters?)

retorquere commented 7 years ago

That's going to be nigh impossible to get right. The BBT character translation works on a char-by-char basis and doesn't take any context into consideration. There is a phase in this process where I have access to chunks of the input text but those chunks are broken up by markup. So alternative-<ending> would be seen as two chunks alternative-/ending and would not be detectable as compound word. All other cases should work... I guess. It can't do a better job than a postscript.
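
For the plain-text case, the <letter><002D><letter> rule with the three-letter threshold suggested above could be sketched as follows. This is a minimal illustration on a bare string: `hyphenateCompounds` and `minLetters` are hypothetical names, nothing here is BBT code, and it shares the limitation just described, since a compound interrupted by markup is not detected.

```javascript
// Hypothetical helper, not part of BBT.
// Replaces "-" with "\hyphen " only between two runs of letters, and only
// when at least one of the two runs has minLetters or more letters.
function hyphenateCompounds(text, minLetters = 3) {
  // The lookahead captures the right-hand word without consuming it,
  // so chained compounds (a-b-c) are checked pair by pair.
  const compound = /(\p{L}+)-(?=(\p{L}+))/gu
  return text.replace(compound, (match, left, right) =>
    (left.length >= minLetters || right.length >= minLetters)
      ? `${left}\\hyphen `
      : match)
}
```

Digits and spaced dashes are left alone because neither side is a run of letters. Applying this from a postscript to biblatex exports only would mirror the earlier postscript example, with the same caveat that it can do no better than the text it is given.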

github-actions[bot] commented 3 years ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.