Juris-M / citeproc-js

A JavaScript implementation of the Citation Style Language (CSL) https://citeproc-js.readthedocs.io
Other
304 stars, 84 forks

Tagged Output #72

Closed by dominic01 5 years ago

dominic01 commented 6 years ago

Is it possible to get tagged output in addition to the styled output? Input:

{
  "Item-2": {
    "id": "Item-2",
    "type": "book",
    "title": "Mastering Regular Expressions",
    "publisher": "O'Reilly Media",
    "number-of-pages": "544",
    "edition": "3",
    "source": "Amazon.com",
    "ISBN": "0596528124",
    "author": [
      {
        "family": "Friedl",
        "given": "Jeffrey E. F."
      }
    ],
    "issued": {
      "date-parts": [
        [ "2006", 8, 15 ]
      ]
    }
  }
}

Output bibliography (APA): Friedl, J. E. F. (2006). Mastering Regular Expressions (3rd ed.). O’Reilly Media.

I am expecting a tagged output like: <aus><au>Friedl, J. E. F.</au></aus>. (<dt>2006</dt>). <ti>Mastering Regular Expressions</ti> (<et>3rd ed.</et>). <pu>O’Reilly Media</pu>.

Is it possible?

michel-kraemer commented 6 years ago

Yes, this is possible. You have to create your own style as follows.

First of all, you have to find a style that is closest to what you want to achieve. In your case, it seems to be APA. Download the original style file from https://github.com/citation-style-language/styles/blob/9296c8c/apa.csl and amend it with the tags you need. For example, from line 78 to 96 you'll find the macro for the author. Use the prefix and suffix attributes to add your <au> and </au> tags respectively. Do the same for the other macros and tags.

After that, pass the style as a string to the CSL.Engine constructor:

var citeproc = new CSL.Engine(citeprocSys, yourNewStyleAsText);
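
For readers unsure what `citeprocSys` needs to contain: citeproc-js expects an object with `retrieveLocale` and `retrieveItem` methods. The following is only a minimal sketch under stated assumptions. The helper name `makeSys`, the `items` map, and the placeholder locale string are illustrative and not part of the library; a real setup would pass the contents of an actual CSL locale file.

```javascript
// Sketch of the `sys` object passed as the first argument to CSL.Engine.
// `makeSys`, `items`, and `localeXml` are hypothetical names for illustration.
function makeSys(items, localeXml) {
  return {
    // The processor calls this to obtain the locale XML for a language tag.
    retrieveLocale: function (lang) {
      return localeXml;
    },
    // The processor calls this to obtain one CSL-JSON item by its id.
    retrieveItem: function (id) {
      return items[id];
    }
  };
}

var items = {
  "Item-2": { id: "Item-2", type: "book", title: "Mastering Regular Expressions" }
};
// Placeholder only: a real run needs the text of a CSL locale file here.
var citeprocSys = makeSys(items, "<locale/>");

// With citeproc-js loaded as CSL and the amended style in yourNewStyleAsText:
// var citeproc = new CSL.Engine(citeprocSys, yourNewStyleAsText);
// citeproc.updateItems(["Item-2"]);
// var bibliography = citeproc.makeBibliography();
```

The engine calls are left commented out because they need the citeproc-js library and a real style and locale to run.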

dominic01 commented 6 years ago

Appreciate your response; this is exactly what I did in my test environment, and it works. Thank you. However, I am looking for a more generic way of getting the tags: using the JSON element names as tags, such as <publisher>. That way I need not change the CSL; instead I would modify the code so that it produces generic tagged output.

I was trying to play around with src/formatters.js, but somehow it didn't work. Any pointers would be of great help.

mshd commented 6 years ago

Hello Michel, hello Dominic, I'm working on the same project to create a dataset for GROBID, so I will need annotated outputs (see http://grobid.readthedocs.io/en/latest/training/Bibliographical-references/). @dominic01 Have you modified the code already? Would you like to share it? Changing the JSON elements will work for the title, but not for date and other abbreviated names.

@michel-kraemer I've considered your idea of changing the CSL files, but changing all 1700 files would be rather difficult. Is there any other possibility? If not, I will have to change the code (certainly not ideal). Could you point me to the necessary lines?

fbennett commented 6 years ago

This can be done with hooks in the processor. It has been a very long time since anyone asked about it, and I will need to dig into the code to figure out the incantations. But it is definitely possible.

mshd commented 6 years ago

Thanks for the quick reply, let me know when you find it...

mshd commented 6 years ago

Dear @fbennett, have you found the part of the script "citeproc_commonjs.js" where I can add the annotated tags?

dominic01 commented 6 years ago

> @dominic01 Have you modified the code already? Would you like to share it? Changing the JSON elements will work for the title however not for date and other abbreviated names.

I tried a few sample tags and it worked, but I didn't go further because too many files would need to be changed. I was expecting a simple solution that would put the JSON field names as tags in the output file.
apa-tag.txt: https://github.com/Juris-M/citeproc-js/files/2273986/apa-tag.txt

mshd commented 6 years ago

Hey @dominic01, I put your idea into practice and wrote a script to change all the .csl files; feel free to use it. Using Node.js/JavaScript:

styleString = styleString.replace(/<text variable="(.*?)"( prefix="([^"]*)")?( suffix="([^"]*)")?/g, '<text variable="$1" prefix="$3&lt;$1&gt;" suffix="&lt;/$1&gt;$5"');
styleString = styleString.replace(/<date variable="(.*?)"( prefix="([^"]*)")?( suffix="([^"]*)")?/g, '<date variable="$1" prefix="$3&lt;$1&gt;" suffix="&lt;/$1&gt;$5"');
styleString = styleString.replace(/<names variable="(.*?)"( prefix="([^"]*)")?( suffix="([^"]*)")?/g, '<names variable="$1" prefix="$3&lt;$1&gt;" suffix="&lt;/$1&gt;$5"');

To disable sorting, use:

styleString = styleString.replace(/<sort>([\s\S]*?)<\/sort>/g,'');

fbennett commented 5 years ago

I'm very late with a response here, but in case the question comes up again ...

Enabling variable wrapper hooks in the processor is one way to do this, but it has not been documented and would require study and testing. For simple tags identifying the source variables behind the output, though, you can just set the csl_reverse_lookup_support flag on the processor. It doesn't work dynamically; you need to modify the setting in the source before instantiating with a style. The line to change is here:

https://github.com/Juris-M/citeproc-js/blob/master/src/state.js#L152

That setting is what the CSL Style Editor uses to show where each piece of output comes from in the style and vice versa. The source line in the editor where the change is applied is here:

https://github.com/citation-style-language/csl-editor/blob/bf5d06770490c2ef86cd1a928e8dfa699de5fb1b/src/citationEngine.js#L66

Frank
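
For anyone who wants to try the regex-based rewrite mshd posted earlier in the thread, it can be folded into one function. This is only a sketch, not code from citeproc-js: the name `tagStyle` and the `removeSort` option are hypothetical. Like the original snippets, it assumes that `prefix` and `suffix`, when present, directly follow the `variable` attribute in that order.

```javascript
// Wrap the output of every <text>, <date>, and <names> element that has a
// `variable` attribute in a tag named after that variable.
// `tagStyle` and `options.removeSort` are illustrative names only.
function tagStyle(styleString, options) {
  var opts = options || {};
  var tagged = styleString.replace(
    /<(text|date|names) variable="([^"]*)"( prefix="([^"]*)")?( suffix="([^"]*)")?/g,
    function (match, element, variable, prefixAttr, prefix, suffixAttr, suffix) {
      // Keep any existing affixes outside the new tags.
      var p = prefix || "";
      var s = suffix || "";
      // The angle brackets are written as entities because they sit inside
      // XML attribute values.
      return "<" + element + ' variable="' + variable + '"' +
        ' prefix="' + p + "&lt;" + variable + '&gt;"' +
        ' suffix="&lt;/' + variable + "&gt;" + s + '"';
    }
  );
  if (opts.removeSort) {
    // Drop <sort> blocks so entries keep their input order.
    tagged = tagged.replace(/<sort>[\s\S]*?<\/sort>/g, "");
  }
  return tagged;
}

var sample = '<text variable="publisher" prefix=" " suffix="."/>';
var result = tagStyle(sample);
// result: <text variable="publisher" prefix=" &lt;publisher&gt;" suffix="&lt;/publisher&gt;."/>
```

A function replacer is used instead of `$n` placeholders so that affixes containing `$` are copied literally. Elements listing several variables (for example `variable="editor translator"`) would yield a tag name containing a space, so those need separate handling.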