zepinglee / citeproc-lua

A Lua implementation of the Citation Style Language (CSL)
MIT License
62 stars, 7 forks

Modification of citeproc-bib.lua to support German umlauts #24

Closed: uttiES closed this issue 1 year ago

uttiES commented 1 year ago

Hey,

great project, thank you for sharing it! To use it for German references, I had to modify citeproc-bib.lua, and I thought you might be interested in my code, since you left a "-- TODO: unicode chars like \"{o}" comment in bib.unescape. Here is the simple code I inserted below that TODO:

    str = string.gsub(str, "\\\"o", "ö")
    str = string.gsub(str, "\\\"a", "ä")
    str = string.gsub(str, "\\\"u", "ü")
    str = string.gsub(str, "\\\"O", "Ö")
    str = string.gsub(str, "\\\"A", "Ä")
    str = string.gsub(str, "\\\"U", "Ü")

Additionally, I found that str = string.gsub(str, "\\%%", "%") fails for some files with this error: "...mf-dist/scripts/citation-style-language/citeproc-bib.lua:158: invalid use of '%' in replacement string", where line 158 is exactly that gsub call. My quick fix was to comment the line out, but I guess you can do better than me and fix the issue properly? :)

Cheers

zepinglee commented 1 year ago

I've started working on this, but it's more complicated than I expected: all patterns like \"o, \"{o}, {\"o}, and {\"{o}} should be converted to ö, and there is a full list of symbols in https://github.com/latex3/latex2e/blob/develop/base/utf8ienc.dtx. I'm rewriting the whole BibTeX parser with lpeg to solve this issue.
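To illustrate why the four spellings need separate handling, here is a minimal sketch in plain Lua string patterns. It is not the lpeg-based parser mentioned above: the names `umlauts`, `patterns`, and `unescape_umlauts` are hypothetical, and the table covers only the six letters from this issue rather than the full list in utf8ienc.dtx.

```lua
-- Hypothetical lookup table (not part of citeproc-lua): base letter -> umlaut.
local umlauts = {
  a = "ä", o = "ö", u = "ü",
  A = "Ä", O = "Ö", U = "Ü",
}

-- Most deeply braced form first, so that {\"{o}} is not reduced to {ö}
-- by one of the shorter patterns.
local patterns = {
  '{\\"{(%a)}}',  -- {\"{o}}
  '{\\"(%a)}',    -- {\"o}
  '\\"{(%a)}',    -- \"{o}
  '\\"(%a)',      -- \"o
}

local function unescape_umlauts(str)
  for _, pattern in ipairs(patterns) do
    -- With a table as replacement, gsub looks up the captured letter;
    -- a missing key leaves the original match untouched.
    str = str:gsub(pattern, umlauts)
  end
  return str
end

print(unescape_umlauts('M{\\"u}ller and Sch\\"{o}n'))
```

Because a table is used as the replacement, a sequence such as \"e, which has no entry, passes through unchanged instead of being deleted. A real parser would also need to cover other accents and nested groups, which is presumably why a grammar-based approach was chosen.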

The percent issue is a bug, and the line should be corrected to: str = string.gsub(str, "\\%%", "%%").
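The corrected call can be tried in isolation with the short sketch below. The wrapper name `unescape_percent` is hypothetical; only the gsub arguments come from this thread. In the replacement string `%` is an escape character, so a literal percent sign has to be written as `%%`. The replacement is only processed when the pattern matches, which would explain why the error appeared for some files only (those containing `\%`).

```lua
-- Hypothetical wrapper around the corrected call from this thread.
local function unescape_percent(str)
  -- Pattern "\\%%" matches the two characters \% ;
  -- replacement "%%" produces a single literal % .
  local result = str:gsub("\\%%", "%%")
  return result
end

print(unescape_percent("a 5\\% increase"))  --> a 5% increase

-- The original replacement "%" for comparison. On Lua 5.2 and later
-- (LuaTeX bundles 5.3) this is expected to fail with
-- "invalid use of '%' in replacement string"; 5.1-style
-- implementations may accept it silently.
local ok, err = pcall(string.gsub, "a 5\\% increase", "\\%%", "%")
print(ok, err)
```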

zepinglee commented 1 year ago

At the moment, the best practice is to use Zotero with the Better BibTeX (BBT) plugin: import the .bib database into Zotero and then export it to CSL-JSON format for use with citeproc-lua. The BBT plugin is robust at .bib conversion.

zepinglee commented 1 year ago

@uttiES I've rewritten the BibTeX parser with lpeg (c720941) and it should now convert LaTeX commands like \"o to Unicode correctly. Can you provide your .bib data for further testing?

uttiES commented 1 year ago

> @uttiES I've rewritten the BibTeX parser with lpeg (c720941) and it should now convert LaTeX commands like \"o to Unicode correctly. Can you provide your .bib data for further testing?

Oh so nice! Here's a short example exported from Zotero:

Example.zip

zepinglee commented 1 year ago

I've added it to test/bibtex/issue-24.bib and it's converted to CSL-JSON correctly (test/bibtex/issue-24.json).