Glossary for bookdown - Githubissues

ghost commented 8 years ago

Hi!

First of all, this "issue" is in fact a question.

Is there any way to use a glossary at the bookdown level so that all possible outputs (HTML, EPUB, PDF) have it?

At the moment I'm using inline code like

`r LatexOrOther("\\Acrlong{esa}","Electrical Signature Analysis")`

which forces me to write the entry of the glossary in the second argument.

I'm thinking of writing an R function that fetches the second argument from a glossary tex file for me. But before I get to work, I'd like some feedback from you, just to check whether there is a way to do this "cleanly", or perhaps a better way altogether.

The goal is a common syntax for referring to a glossary file that is agnostic to the output format (the source could be a glossary tex file, or a JSON file that is converted to a glossary tex file for PDF and to whatever format EPUB needs), so that I could just refer to the acronym key instead of doing what I do now.
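A minimal sketch of the idea in the question, written in Python rather than R purely for illustration: the long form is looked up by key in a JSON glossary instead of being typed as a second argument. The JSON layout (`short`/`long` fields) and the names `load_glossary` and `acr_long` are assumptions made for this example, not part of bookdown.

```python
import json

# Hypothetical glossary content; in practice it would live in a file such as glossary.json.
GLOSSARY_JSON = '{"esa": {"short": "ESA", "long": "Electrical Signature Analysis"}}'


def load_glossary(text):
    """Parse the JSON glossary into a dict keyed by acronym key."""
    return json.loads(text)


def acr_long(glossary, key, output_format):
    """Return the long form of an entry for the given output format.

    For LaTeX output the glossary command is emitted and LaTeX resolves it;
    for every other format the long form is returned as plain text.
    """
    if key not in glossary:
        raise KeyError(f"unknown glossary key: {key}")
    if output_format == "latex":
        return "\\Acrlong{" + key + "}"
    return glossary[key]["long"]


glossary = load_glossary(GLOSSARY_JSON)
print(acr_long(glossary, "esa", "latex"))
print(acr_long(glossary, "esa", "html"))
```

With this shape, the document only ever mentions the key `esa`; the text of the entry lives in one place.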

Thanks in advance

yihui commented 8 years ago

Currently there is no way to generate the glossary for non-LaTeX output formats, so you are welcome to work on this, and submit a pull request. We certainly need to reinvent the wheel for non-LaTeX output, and that is the tricky part. Here are my quick thoughts:

store glossaries in a format that is easy to parse, such as JSON (I'd prefer this), XML, or YAML, but just not LaTeX;
like the existing \@ref(), I think \@Gls(), \@gls(), etc, might be reasonable as the common syntax
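To make the suggestion concrete, here is a rough sketch of how a `\@gls(key)`-style syntax could be resolved against a glossary loaded from JSON. It is not bookdown code: the regular expression, the `resolve_gls` name, and the rule that the capitalized variant upper-cases the first letter (borrowed from the LaTeX convention) are all assumptions.

```python
import re

# Hypothetical glossary, as it might look after parsing a JSON file.
GLOSSARY = {"esa": "electrical signature analysis"}

# Matches \@gls(key) and \@Gls(key); the key alphabet is an assumption.
GLS_PATTERN = re.compile(r"\\@(gls|Gls)\(([A-Za-z0-9_-]+)\)")


def resolve_gls(text, glossary):
    """Replace every glossary reference in text with its entry."""

    def substitute(match):
        command, key = match.group(1), match.group(2)
        if key not in glossary:
            return match.group(0)  # leave unknown keys untouched
        entry = glossary[key]
        if command == "Gls":
            # Capitalized variant: upper-case the first letter only.
            entry = entry[:1].upper() + entry[1:]
        return entry

    return GLS_PATTERN.sub(substitute, text)


print(resolve_gls(r"\@Gls(esa) is a diagnostic method; see \@gls(esa).", GLOSSARY))
```

Leaving unknown keys in place makes typos visible in the rendered output instead of silently dropping text.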

If this turns out to be too much work, I'm okay if bookdown does not have this feature. I think the gain is relatively small. I'd just write a definition list instead: http://pandoc.org/MANUAL.html#definition-lists which is available to all output formats.
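As an illustration of the definition-list fallback, the sketch below turns a dictionary of terms into Pandoc's definition-list Markdown (a term line followed by a line starting with a colon). The function name and the sample entries are made up for the example; sorting alphabetically is a design choice, not something Pandoc requires.

```python
def to_definition_list(entries):
    """Render a {term: definition} dict as a Pandoc definition list."""
    blocks = []
    for term in sorted(entries, key=str.lower):
        # Pandoc syntax: the term on one line, then ":   definition".
        blocks.append(f"{term}\n:   {entries[term]}")
    return "\n\n".join(blocks) + "\n"


# Hypothetical entries, for illustration only.
print(to_definition_list({
    "ESA": "Electrical Signature Analysis",
    "API": "Application Programming Interface",
}))
```

Because the result is ordinary Markdown, Pandoc can render it in every output format without any format-specific handling.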

Here is one implementation for PDF output: https://yongfu.name/2018/10/24/glossary-maker.html

augustoamerico commented 8 years ago

Regarding this issue, I'd like to allow Markdown syntax in the values of the JSON, for example: [image: json_glossary]

Now that I'm trying to parse the JSON, I can't find a way to convert just the value string, in memory, to (latex|html|epub). How would you do this?

yihui commented 8 years ago

You mean convert the strings to (latex/html/epub), right? This is a separate problem to which I don't have a solution, either. You can leave it aside for now and I'll think more about it.
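One possible approach to the string-conversion problem, offered only as a sketch and not as the solution the thread settled on: pipe the Markdown string through the `pandoc` command-line tool, which reads standard input when no file is given. This assumes `pandoc` is installed and on the PATH; the helper names are hypothetical. EPUB content is XHTML, so an HTML fragment would presumably be used there.

```python
import shutil
import subprocess


def pandoc_command(output_format):
    """Build the pandoc invocation for converting Markdown from stdin."""
    return ["pandoc", "--from", "markdown", "--to", output_format]


def convert_markdown(text, output_format):
    """Convert a Markdown string in memory by piping it through pandoc."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc executable not found on PATH")
    result = subprocess.run(
        pandoc_command(output_format),
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise if pandoc exits with an error
    )
    return result.stdout.strip()


# Only attempt the conversion when pandoc is actually available.
if shutil.which("pandoc") is not None:
    for fmt in ("latex", "html"):
        print(fmt, "->", convert_markdown("Electrical *Signature* Analysis", fmt))
```

Starting one process per glossary value is slow for large glossaries; batching all values into a single pandoc call would be a natural refinement.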

augustoamerico commented 8 years ago

Yup, that’s right. I will take a look at pandoc’s parser and check whether it is feasible for me to write some kind of wrapper, even if it has to be in Haskell. I will keep implementing what I can of the feature, while “scheduling” this problem as a background task. When I reach an “I’m all out of cards” point about converting a string, I will post here again.

Thanks for the feedback Yihui!

Tiago dos Santos, MSc Computer Science, FCT/UNL, https://www.tdsantos.com

rainer-rq-koelle commented 7 years ago

Being relatively new to R/RStudio/RMarkdown/RBookdown, I find the discussion here way above my head. Nonetheless, I think a glossary and/or list-of-abbreviations feature for RBookdown is an essential add-on (including the availability of such a feature across the different output formats, in particular MS Word, as this thing is still around, but that is another discussion).

I am not sure whether it helps the discussion, but these are my two cents:

I see "some" merit in creating an external glossary/definition file along the lines of an external reference file (e.g. bibtex). This serves reusability and ensures consistency across the body of work of (at least) the author making use of this external glossary file.
The issues may start the moment one thinks about reusing the same term or abbreviation, but I assume this is a question of assigning unique labels/ids in the glossary file.
On the other hand, a glossary/list of abbreviations lives inside a work (e.g. book, thesis), and its primary purpose is to shed light on the terms, definitions, and abbreviations inside that work.
In that respect, it is helpful to "load" / define the terms, definitions, and abbreviations as the author works, hence the idea that a function to define-while-you-go() would add to the workflow of authors. This approach suggests interlinked functions for querying the existing terms/definitions and/or updating them.
Seeing the cross-referencing of bookdown, the desired behaviour to call/link a glossary entry should follow along the lines of gls:cool-term-here and refer as Yihui suggested above (consistency of syntax supports its use / recollection by the author).
Finally, the complete list should be "injected" when the document is built.

I know too little to make the pandoc definition list work. But seeing the principle of how such a list is created prompts me to ask how such a list or R object can be created/queried across the whole document and inserted at build time (in all output formats). I think adding a feature to extract and store such a list externally should not be the major issue. That could then also be a means of loading a sort of "starting" glossary/definition list from an external file, if the author wishes to do so.
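The define-while-you-go workflow described above could look roughly like the following sketch: a registry that rejects duplicate keys, records which entries are actually referenced, and hands back the sorted list to inject at build time. The class and method names are invented for illustration and do not correspond to any existing bookdown API.

```python
class Glossary:
    """In-memory registry of terms that tracks which ones were used."""

    def __init__(self):
        self._entries = {}  # key -> (term, definition)
        self._used = set()  # keys referenced in the text so far

    def define(self, key, term, definition):
        """Register a term; keys must be unique across the document."""
        if key in self._entries:
            raise ValueError(f"duplicate glossary key: {key}")
        self._entries[key] = (term, definition)

    def use(self, key):
        """Return the term for inline use and remember that it was used."""
        term, _definition = self._entries[key]
        self._used.add(key)
        return term

    def used_entries(self):
        """Entries referenced so far, sorted by term, ready to be injected."""
        used = (self._entries[key] for key in self._used)
        return sorted(used, key=lambda entry: entry[0].lower())


glossary = Glossary()
glossary.define("esa", "ESA", "Electrical Signature Analysis")
glossary.define("api", "API", "Application Programming Interface")
print(glossary.use("esa"))
print(glossary.used_entries())
```

Listing only the referenced entries lets a large shared glossary file be reused across works without every book printing all of it.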

Keep up the good work, gents. I feel miserable having to admit that I have no idea how these requirements can be implemented, but I am positive that such user requirements are a good driver for augmenting the current set of features bookdown offers ... and they demonstrate that there is utility in such a feature. Thanks.

rchaput commented 2 years ago

Hello,

I wanted to achieve something similar to what is described in this issue (a list of acronyms rather than a glossary), and so I tried to implement a Pandoc Lua Filter, based on the Definition List idea.

Basically, my solution is to define the acronyms in the YAML metadata, and then to use \acr{key} in the document. The filter will replace the first occurrence by the acronym’s definition, and subsequent occurrences by just its short name. (I think this may be easily extended to support glossaries.) A list of acronyms is also generated, and all acronyms are linked to their definition.

You can find the result in the following package: https://github.com/rchaput/acronymsdown/
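The first-occurrence behaviour described above can be sketched as follows. This is a plain-text Python illustration, not the package's Lua filter: the "long (short)" rendering of the first occurrence, the regular expression, and the data layout are assumptions, and linking and list generation are left out.

```python
import re

# Hypothetical acronym table, standing in for the YAML metadata.
ACRONYMS = {"esa": {"short": "ESA", "long": "Electrical Signature Analysis"}}

# Matches \acr{key}.
ACR_PATTERN = re.compile(r"\\acr\{([^}]+)\}")


def expand_acronyms(text, acronyms):
    """Expand the first use of each acronym; later uses get the short name."""
    seen = set()

    def substitute(match):
        key = match.group(1)
        if key not in acronyms:
            return match.group(0)  # leave unknown keys untouched
        entry = acronyms[key]
        if key in seen:
            return entry["short"]
        seen.add(key)
        return f'{entry["long"]} ({entry["short"]})'

    return ACR_PATTERN.sub(substitute, text)


print(expand_acronyms(r"\acr{esa} is common. \acr{esa} works on motors.", ACRONYMS))
```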

I preferred to make a separate package rather than a pull request because I do not know enough of Pandoc / Knitr / RMarkdown / …, and I am pretty sure I missed some corner cases. If you have feedback about that, I will happily try and improve my solution!

Also, if you want to integrate any part of my work into another system, I am fine with it. I believe that integrating it directly in bookdown would be easier for the user. I tried to remain agnostic w.r.t. the output format, but I could not find an elegant solution to automatically register my Lua Filter (for now, I use pandoc_args: !expr acronymsdown::add_filter() in the YAML metadata, where add_filter() is a function that returns --lua-filter /path/to/filter/in/package).

cderv commented 2 years ago

Thanks for sharing your solution. The way you are using it is the proper way. It can be more closely integrated by using a custom format, but using pandoc_args is the correct way to extend an existing format.

I take the opportunity to also share this work: https://github.com/yonicd/glossaries Maybe it could help with this feature. It does not use a Lua filter but takes another approach.

rstudio / bookdown

Glossary for bookdown #199