Juris-M / citeproc-js

A JavaScript implementation of the Citation Style Language (CSL) https://citeproc-js.readthedocs.io

Locale seems to always default to `en-US`, even when supplied with a "locale getter" function #4

Closed dsifford closed 8 years ago

dsifford commented 8 years ago

Hi there.

I can't seem to get any locale other than en-US. To my knowledge, I'm supplying CSL.Engine with the correct functions to retrieve the locale, but it just isn't happening.

Here is the relevant code from my repo:


declare var CSL;

export class CSLPreprocessor {

    public citeprocSys: Citeproc.SystemObj;
    public citations: { [id: string]: CSL.Data };

    /**
     * Constructor for the CSLPreprocessor class.
     * @param  {string}  locale  The current user's locale string.
     * @param  {{[id: string]:CSL.Data}}  citations  An object of CSL.Data.
     *   Note, the object key MUST match the `id` param within the object.
     * @param  {string}   style  The user's selected style string.
     * @param  {Function} callback  Callback function
     * @return {void}
     */
    constructor(locale: string, citations: { [id: string]: CSL.Data }, style: string, callback: Function) {

        this.citations = citations;

        let p1 = new Promise((resolve, reject) => {
            this.getLocale(locale, (data: string) => {
                resolve({
                    retrieveLocale: (lang) => data,
                    retrieveItem: (id: string | number) => this.citations[id],
                });
            });
        });

        p1.then((data: Citeproc.SystemObj) => {
            this.citeprocSys = data;
            this.getProcessor(style, data, callback);
        });

    }

    /**
     * Retrieves the locale rules for CSL using HTTP and passes it to a callback function.
     * @param {string}   locale   The user's locale.
     * @param {Function} callback Callback function.
     */
    getLocale(locale: string, callback: Function): void {
        let req = new XMLHttpRequest();

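        // Fall back to en-US when the WordPress locale is unknown (undefined)
        // or has no CSL equivalent (false).
        let cslLocale = this.locales[locale];
        if (typeof cslLocale !== 'string') {
            locale = 'en-US';
        }
        else {
            locale = cslLocale;
        }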

        req.onreadystatechange = () => {
            if (req.readyState === 4) {
                callback(req.responseText);
            }
        };

        req.open("GET", `https://raw.githubusercontent.com/citation-style-language/locales/8c976408d3cb287d0cecb29f97752ec3a28db9e5/locales-${locale}.xml`);
        req.send(null);
    }

    /**
     * Retrieves the CSL style rules for the selected style using HTTP. When the
     *   style instructions are received, a CSL.Engine is created and passed to
     *   the callback function.
     * @param  {string}   styleID  The style ID for the style of interest (no .csl extension)
     * @param  {Function} callback Callback function.
     */
    getProcessor(styleID: string, data: Citeproc.SystemObj, callback: Function): void {
        let req = new XMLHttpRequest();
        req.open('GET', `https://raw.githubusercontent.com/citation-style-language/styles/master/${styleID}.csl`);

        req.onreadystatechange = () => {
            if (req.readyState === 4) {
                let citeproc = new CSL.Engine(data, req.responseText);
                callback(citeproc);
            }
        };

        req.send(null);
    }

    /**
     * Receives the response object from `getProcessor`, makes the bibliography,
     *   removes outer HTML, pushes it to an array, and returns the array.
     * @param  {Object}   citeproc The citeproc engine.
     * @return {string[]}          Array of citations to be served.
     */
    prepare(citeproc): string[] {
        citeproc.updateItems(Object.keys(this.citations));
        let bib = citeproc.makeBibliography();

        let data = [];
        bib[1].forEach(ref => {
            data.push(this.trimHTML(ref));
        });
        return data;
    }

    /**
     * Removes outer HTML formatting served from citeproc, sparing inner `<i>` tags.
     * @param  {string} ref The reference payload from citeproc.
     * @return {string}     A formatted reference string without outer HTML.
     */
    trimHTML(ref: string): string {
        return ref
            .replace(/<(?!(i|\/i|a|\/a)).+?>/g, '')
            .trim()
            .replace(/^\d+\.\s?/, '');
    }

    /**
     * This object converts the locale names in wordpress (keys) to the locales
     *   in CSL (values). If CSL doesn't have a locale for a given WordPress locale,
     *   then false is used (which will default to en-US).
     */
    private locales: {[wp: string]:string|boolean} = {
        'af': 'af-ZA',
        'ak': false,
        'am': false,
        'ar': 'ar',
        'arq': 'ar',
        'art_xemoji': 'ar',
        'ary': 'ar',
        'as': 'en-US',
        'az_TR': 'tr-TR',
        'az': 'tr-TR',
        'azb': 'en-US',
        'ba': false,
        'bal': false,
        'bcc': false,
        'bel': false,
        'bg_BG': 'bg-BG',
        'bn_BD': 'en-US',
        'bo': false,
        'bre': false,
        'bs_BA': false,
        'ca': 'ca-AD',
        'ceb': false,
        'ckb': false,
        'co': false,
        'cs_CZ': 'cs-CZ',
        'cy': 'cy-GB',
        'da_DK': 'da-DK',
        'de_CH': 'de-CH',
        'de_DE': 'de-DE',
        'dv': false,
        'dzo': false,
        'el': 'el-GR',
        'en_AU': 'en-US',
        'en_CA': 'en-US',
        'en_GB': 'en-GB',
        'en_NZ': 'en-US',
        'en_US': 'en-US',
        'en_ZA': 'en-US',
        'eo': false,
        'es_AR': 'es-ES',
        'es_CL': 'es-CL',
        'es_CO': 'es-CL',
        'es_ES': 'es-ES',
        'es_GT': 'es-ES',
        'es_MX': 'es-MX',
        'es_PE': 'es-CL',
        'es_PR': 'es-CL',
        'es_VE': 'es-CL',
        'et': 'et-EE',
        'eu': 'eu',
        'fa_AF': 'fa-IR',
        'fa_IR': 'fa-IR',
        'fi': 'fi-FI',
        'fo': false,
        'fr_BE': 'fr-FR',
        'fr_CA': 'fr-CA',
        'fr_FR': 'fr-FR',
        'frp': false,
        'fuc': false,
        'fur': false,
        'fy': false,
        'ga': false,
        'gd': false,
        'gl_ES': false,
        'gn': false,
        'gsw': 'de-CH',
        'gu': false,
        'haw_US': 'en-US',
        'haz': false,
        'he_IL': 'he-IL',
        'hi_IN': false,
        'hr': 'hr-HR',
        'hu_HU': 'hu-HU',
        'hy': false,
        'id_ID': 'id-ID',
        'ido': false,
        'is_IS': 'is-IS',
        'it_IT': 'it-IT',
        'ja': 'ja-JP',
        'jv_ID': false,
        'ka_GE': false,
        'kab': false,
        'kal': false,
        'kin': false,
        'kk': false,
        'km': 'km-KH',
        'kn': false,
        'ko_KR': 'ko-KR',
        'ky_KY': false,
        'lb_LU': false,
        'li': false,
        'lin': false,
        'lo': false,
        'lt_LT': 'lt-LT',
        'lv': 'lv-LV',
        'me_ME': false,
        'mg_MG': false,
        'mk_MK': false,
        'ml_IN': false,
        'mn': 'mn-MN',
        'mr': false,
        'mri': false,
        'ms_MY': false,
        'my_MM': false,
        'nb_NO': 'nb-NO',
        'ne_NP': false,
        'nl_BE': 'nl-NL',
        'nl_NL': 'nl-NL',
        'nn_NO': 'nn-NO',
        'oci': false,
        'ory': false,
        'os': false,
        'pa_IN': false,
        'pl_PL': 'pl-PL',
        'ps': false,
        'pt_BR': 'pt-BR',
        'pt_PT': 'pt-PT',
        'rhg': false,
        'ro_RO': 'ro-RO',
        'roh': false,
        'ru_RU': 'ru-RU',
        'rue': false,
        'rup_MK': false,
        'sa_IN': false,
        'sah': false,
        'si_LK': false,
        'sk_SK': 'sk-SK',
        'sl_SI': 'sl-SI',
        'snd': false,
        'so_SO': false,
        'sq': false,
        'sr_RS': 'sr-RS',
        'srd': false,
        'su_ID': false,
        'sv_SE': 'sv-SE',
        'sw': false,
        'szl': false,
        'ta_IN': false,
        'ta_LK': false,
        'tah': false,
        'te': false,
        'tg': false,
        'th': 'th-TH',
        'tir': false,
        'tl': false,
        'tr_TR': 'tr-TR',
        'tt_RU': false,
        'tuk': false,
        'twd': false,
        'tzm': false,
        'ug_CN': false,
        'uk': 'uk-UA',
        'ur': false,
        'uz_UZ': false,
        'vi': 'vi-VN',
        'wa': false,
        'xmf': false,
        'yor': false,
        'zh_CN': 'zh-CN',
        'zh_HK': 'zh-CN',
        'zh_TW': 'zh-TW',
    }

}

Any tips would be much appreciated. Thank you in advance!

dsifford commented 8 years ago

The only thing I can think of here is that I technically don't supply CSL.engine with precisely the "getter" function that the docs describe.

Instead, I get the correct locale XML one time prior to creating a new CSL.engine, and my retrieveLocale is basically a wrapper that returns the raw response data I already retrieved.

I have no idea why this wouldn't work.

fbennett commented 8 years ago

Yeah, I've been looking at the arrow syntax, which I haven't used at this level, so I was spinning my wheels a bit trying to get my head around it. With a non-English locale, the processor will make multiple calls on retrieveLocale(), first to obtain the ground-default en-US data, then again to obtain the terms for the explicit default locale (if any), and again for terms defined inside the style. I'm not sure why you're getting exclusively en-US terms, but if the function is bound to a particular locale, that won't work correctly.

dsifford commented 8 years ago

Thanks for the reply @fbennett and kudos for your work on this (the amount of work you've done here is unbelievable)...

To clarify: each time CSL calls my retrieveLocale function, it gets the locale XML from your locales repo. I've confirmed by tracing the call stack that it is the correct XML file (for example, th-TH) but for some reason, somewhere down the line it defaults back to en-US. I can perhaps take a screen recording as I step through the call stack if you think that might be helpful.

fbennett commented 8 years ago

Thanks for your kind words - it's great to see citeproc-js put to good use.

The processor will call the retrieveLocale() function repeatedly, first with 'en-US' as argument, then with the language tag of the default-locale of the style (such as 'th-TH'). The function should return the serialized XML of the corresponding locale file (assuming that the style is supplied to the processor in that form as well).
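
One way to check which language tags the processor actually requests is to wrap the getter and record its arguments. The following is a minimal sketch, not part of citeproc-js: `traceRetrieveLocale`, `requested`, and the placeholder XML string are hypothetical names introduced here, and the two calls at the bottom only simulate the request order described above.

```typescript
type RetrieveLocale = (lang: string) => string | undefined;

// Wraps a locale getter and records every language tag it is asked for.
function traceRetrieveLocale(inner: RetrieveLocale): { retrieveLocale: RetrieveLocale; requested: string[] } {
    const requested: string[] = [];
    return {
        requested,
        retrieveLocale: (lang) => {
            requested.push(lang);
            return inner(lang);
        },
    };
}

// A getter bound to a single payload, like the one in the original post.
const thaiXml = '<locale xml:lang="th-TH"/>'; // placeholder, not a real locale file
const traced = traceRetrieveLocale(() => thaiXml);

// Simulate the two requests described above.
const first = traced.retrieveLocale('en-US');
const second = traced.retrieveLocale('th-TH');
```

With a getter that ignores its argument, `traced.requested` shows two different tags while `first` and `second` hold the same Thai payload, which makes the mismatch between key and data visible.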

I can follow JavaScript reasonably well, but your code is written in some other syntax that I can follow only loosely (from a quick look around with Google, it might be typescript?). Looking at the definition of retrieveLocale(), it seems that it might not be able to fetch and return arbitrary locale data (that is, it looks as though "data" in the getLocale() function might behave something like a closure) - but that's said without any knowledge of the language of the code, so I'm most likely wrong there.

If it might be helpful for me to look through a stack trace, I'd be happy to do that - if the fault lies in the processor itself, I would certainly like to track down the cause.

fbennett commented 8 years ago

There are robust functions for inline citations, that support a couple of possible approaches.

In general, for locales and for cited items, the processor code assumes that retrieve(Locale|Item)() can fetch the target by ID from the current context. In Web environments, that most often means keeping a global hash object for each in the processor context, where the functions can get ahold of them.
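
As a rough illustration of the "global hash object" arrangement, the sketch below keeps one store for locales and one for items and has the two getters look up by ID. The names `localeStore`, `itemStore`, and `CslItem` are hypothetical, and the stored values are placeholders rather than real locale XML or a complete CSL item.

```typescript
// Hypothetical minimal item shape; real CSL items carry many more fields.
interface CslItem {
    id: string;
    [field: string]: unknown;
}

interface SystemObj {
    retrieveLocale(lang: string): string | undefined;
    retrieveItem(id: string): CslItem | undefined;
}

// Stores that live in the same context as the processor.
const localeStore: Record<string, string> = {};
const itemStore: Record<string, CslItem> = {};

// Both getters resolve their target by ID at call time.
const sys: SystemObj = {
    retrieveLocale: (lang) => localeStore[lang],
    retrieveItem: (id) => itemStore[id],
};

// Populate the stores before (or after) the processor is created.
localeStore['en-US'] = '<locale xml:lang="en-US"/>'; // placeholder XML
itemStore['ITEM-1'] = { id: 'ITEM-1', type: 'book', title: 'Example' };
```

Because the getters read from the stores on every call, data added later is still visible to the processor without rebuilding the `sys` object.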

In word processors, we use updateItems() with processCitationCluster(). The former takes a list of item IDs and registers them in the processor. The latter accepts a full citation object with an ID, a list of citation-object predecessor and successor IDs with their note numbers, and returns an object containing a list of two-element tuples (citation string and citation index, IIRC) that require updating. Simple modifications to an existing cite will return only a single tuple, but if an operation impacts other citations (i.e. an insert, delete, or modification that affects note number or disambiguated citation text), the affected citations will also be returned. The calling application then needs to place the affected citations in their document position.
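
The bookkeeping for one such call can be sketched as follows. This is an assumption-laden illustration: the `Citation` shape (`citationID`, `citationItems`, `properties.noteIndex`) reflects my understanding of what citeproc-js accepts, and `clusterArgs` is a hypothetical helper that only prepares the arguments. The actual call is left as a comment because the exact return shape should be checked against the citeproc-js documentation.

```typescript
// [citationID, noteIndex] pairs describing neighbouring citations.
type CitationRef = [string, number];

interface Citation {
    citationID: string;
    citationItems: { id: string }[];
    properties: { noteIndex: number };
}

// Splits the document's citations into the one being edited plus the
// lists of those before and after it, in document order.
function clusterArgs(doc: Citation[], position: number): { citation: Citation; pre: CitationRef[]; post: CitationRef[] } {
    const toRef = (c: Citation): CitationRef => [c.citationID, c.properties.noteIndex];
    return {
        citation: doc[position],
        pre: doc.slice(0, position).map(toRef),
        post: doc.slice(position + 1).map(toRef),
    };
}

const doc: Citation[] = [
    { citationID: 'a', citationItems: [{ id: 'ITEM-1' }], properties: { noteIndex: 1 } },
    { citationID: 'b', citationItems: [{ id: 'ITEM-2' }], properties: { noteIndex: 2 } },
    { citationID: 'c', citationItems: [{ id: 'ITEM-1' }], properties: { noteIndex: 3 } },
];

const args = clusterArgs(doc, 1);
// With a real engine, the call would then look roughly like:
// citeproc.processCitationCluster(args.citation, args.pre, args.post);
```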

The dance described above strives for efficiency, minimizing the volume of changes needed in the document at each edit. To simplify the coding, many projects use a shortcut method, updating the full citation list with rebuildProcessorState(), which takes as argument a list of citation data objects (at least), and optionally a mode (normally 'html' for the Web), and a list of uncited items to be included in the bibliography only. I believe that Docear uses this method for citation updates.

There is some code in the processor for setting a slug in place of citation-number in numbered styles, added to support processing in Mendeley early in the development of the processor. I'm not sure whether that's still used; I think it shouldn't be necessary, but there are lots of operational requirements out there that I haven't thought of.

The starting point for working out a strategy for inline citations would be the state of the document immediately after a citation is added or edited. If there is a suitable set of tag wrappers and IDs in the document, it should be possible to use processCitationCluster(). That approach is nice because it should scale well to large-ish documents (but on the other hand it is significantly more complex to code and debug).

dsifford commented 8 years ago

your code is written in some other syntax that I can follow only loosely (from a quick look around with Google, it might be typescript?).

Yep! Sorry for not mentioning that. (As an aside, TypeScript is life-changing. You should check it out if you ever get bored).

might not be able to fetch and return arbitrary locale data (that is, it looks as though "data" in the getLocale() function might behave something like a closure)

I'm 98% positive that the data payload is accessible during all points that citeproc needs it. I did run into an issue where my async request lagged behind the CSL.engine call (and it received an undefined variable), but I was able to fix that using promises.

One thing I suppose I should mention is: In your example you use XMLHttpRequest.open('GET', <some-url>, false) (where the false implies a synchronous request). That's no longer possible in all browsers, and will soon be going away completely (described here: https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest#open() ).

In Web environments, that most often means keeping a global hash object for each in the processor context, where the functions can get ahold of them.

Hmm, yeah I might not be able to use citeproc to its full potential then. I'm leveraging it in WordPress which makes things a bit more difficult (the most difficult being, I'm pretty much forced to be stateless. I have no access to the users' databases using ajax). It's probably possible to get working how you describe, but for now I think I'll just keep trucking as I currently am until I get the majority of the smaller bugs ironed out.

If there is a suitable set of tag wrappers and IDs in the document, it should be possible to use processCitationCluster(). That approach is nice because it should scale well to large-ish documents (but on the other hand it is significantly more complex to code and debug).

Yeah, what I'm doing now is moderately complex. On load, I get a list of all the citations within the text and keep them in state. Then if they get moved around (eg. dragged up to a higher position, or down to a lower position) the DOM is manipulated using a series of helper HTML data attributes (which store metadata about the citation as stringified JSON). It's a bit difficult to explain, but it's been working well so far.

Annnnnyway, let me get you that screencast of the stack trace and then we can go from there 👍

fbennett commented 8 years ago

It might be well off base, but have you considered running the processor in a worker thread? It has no DOM dependency, and that configuration makes things rather easy to debug.

dsifford commented 8 years ago

@fbennett I've never used web workers before, but I suppose I could look into it.

Here's the screencast of the stack trace... https://vid.me/k5v1

I tried to keep it to only the relevant sections. The locale that it puts in is definitely th-TH. Let me know if I'm overlooking something.

dsifford commented 8 years ago

So this is interesting:

The locale key is named en-US; however, after looking closer into it, it looks like it has the correct terms (in this case, Thai).

image

Am I overreacting? Should this be working?

I've tried several different styles and, although those terms exist in the citeproc object, it still looks like it's spitting it back in en-US.

fbennett commented 8 years ago

So what seems to be happening is that you are loading the en-US locale data (the resolution of the screencast is a little grainy here, and I can't quite make out the characters, but I think that's right?), and then loading to the same en-US locale key again, but with Thai terms? I'm not sure what will happen there - if the actual en-US locale is reloaded for any reason, it will clobber the Thai data.

To run things per design, you would want to ship an array of locale keys (not a single string) to the getLocale() function in your code, and then load each locale separately (the serialized XML directly from the file) to a hash object, for targeted access via the retrieveLocale() function. Locale keys and data would then correspond, and things should always work as expected.
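
A minimal sketch of that arrangement, written in TypeScript to match the code in the original post: every needed locale is fetched up front into a hash keyed by language tag, and retrieveLocale() then looks up the tag it is given. `loadLocales`, `makeRetrieveLocale`, and the `FetchText` callback are hypothetical names; the fetcher is passed in so that any HTTP mechanism can be used. The base URL reuses the pinned commit from the original post.

```typescript
// Any function that resolves a URL to its response text.
type FetchText = (url: string) => Promise<string>;

const LOCALE_BASE =
    'https://raw.githubusercontent.com/citation-style-language/locales/8c976408d3cb287d0cecb29f97752ec3a28db9e5';

// Fetches each requested locale (plus en-US, which the processor always
// asks for) and stores the serialized XML under its own language tag.
async function loadLocales(tags: string[], fetchText: FetchText): Promise<Record<string, string>> {
    const wanted = ['en-US'].concat(tags).filter((tag, i, all) => all.indexOf(tag) === i);
    const store: Record<string, string> = {};
    await Promise.all(wanted.map(async (tag) => {
        store[tag] = await fetchText(`${LOCALE_BASE}/locales-${tag}.xml`);
    }));
    return store;
}

// Returns a getter that honours the language tag it is called with.
function makeRetrieveLocale(store: Record<string, string>): (lang: string) => string | undefined {
    return (lang) => store[lang];
}
```

The processor would be created only after `loadLocales` resolves, so each synchronous retrieveLocale() call finds its data already in the store.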

I'm curious about why it's reverting to en-US terms in this case, so I'll do a little testing here; but in some extended styles (although not in vanilla CSL), the processor switches locale modes on the fly, so it's important for the processor to store data separately for each locale.

fbennett commented 8 years ago

Had no luck with testing. The best I can figure is that the "data" object you are setting via the function is mutable, and it's either being clobbered by a subsequent load, or reverting with a change in context. Sorry I can't be more help!

dsifford commented 8 years ago

So what seems to be happening is that you are loading the en-US locale data (the resolution of the screencast is a little grainy here, and I can't quite make out the characters, but I think that's right?), and then loading to the same en-US locale key again, but with Thai terms?

Nope. If the person has a Thai locale, I only get the thai XML. That's what I pass into CSL. It'll never pass in en-US unless that is the person's locale.

To run things per design, you would want to ship an array of locale keys (not a single string) to the getLocale() function in your code, and then load each locale separately (the serialized XML directly from the file) to a hash object, for targeted access via the retrieveLocale() function. Locale keys and data would then correspond, and things should always work as expected.

I'm not sure if I'm fully understanding you here. Are you saying that I should be responsible for parsing the XML into key: value fields and then allowing CSL to access each key programmatically? For example, something like this:


// assuming I already received the locale XML, parsed as JSON, and stored it in the class
// as this.localeData

this.localeData = {
    "locale": {
        "info": {
            "rights": {
                "_license": "http://creativecommons.org/licenses/by-sa/3.0/",
                "__text": "This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License"
            },
            "updated": "2012-07-04T23:31:02+00:00"
        },
        "style-options": {
            "_punctuation-in-quote": "false"
        },
        "date": [
            {
                "date-part": [
                    {
                        "_name": "day",
                        "_suffix": " "
                    },
                    {
                        "_name": "month",
                        "_suffix": " "
                    },
                    {
                        "_name": "year"
                    }
                ],
                "_form": "text"
            },
            {
                "date-part": [
                    {
                        "_name": "day",
                        "_form": "numeric-leading-zeros",
                        "_suffix": "/"
                    },
                    {
                        "_name": "month",
                        "_form": "numeric-leading-zeros",
                        "_suffix": "/"
                    },
                    {
                        "_name": "year"
                    }
                ],
                "_form": "numeric"
            }
        ],

    [ ... etc ]
  }

retrieveLocale(key) {
    return this.localeData[key]
}

Because right now all my retrieveLocale function is returning when called by CSL (assuming we're talking about thai) is literally this file as plain text https://github.com/citation-style-language/locales/blob/master/locales-th-TH.xml

fbennett commented 8 years ago

On the first item, I may have misunderstood - are you getting US terms in the rendered cites when the style is set to th-TH, or are the rendered cites coming out correctly?

On the second item, the processor handles all of the parsing. I only meant that the processor expects a call to retrieveLocale('en-US') to return the content of the file locales-en-US.xml. Since a single instantiation of the processor may call multiple locales in series (synchronously), the content of each file should ordinarily be available in the processor context - but the processor will take care of the parsing, you don't need to worry about that. If the output is coming out correctly, though, you can set all of that aside for the present.

fbennett commented 8 years ago

right now all my retrieveLocale function is returning when called by CSL (assuming we're talking about thai) is literally this file as plain text

Yep, exactly right, that's all that needs to happen.

dsifford commented 8 years ago

On the first item, I may have misunderstood - are you getting US terms in the rendered cites when the style is set to th-TH, or are the rendered cites coming out correctly?

I'm getting US terms for all locales.

If you look at 0:28 on the video, you'll see me hover over the retrieveLocale function. Here's that function in ES5 syntax:

retrieveLocale: function(lang) {
  return data;
}

data is the already retrieved XML from the CSL locales repository in plain text. Basically the function doesn't care what you give it as an input param. It doesn't use it at all. All it does is return the same locale (the user's) each time you call it.

Does that make sense / answer your question? (Sorry for being such a hassle!)

dsifford commented 8 years ago

Another interesting finding:

I just tested it with CSL-parsed data from the PubMed API. The locale that I gave it was the th-TH XML, and I received the following console warning.

CSL: Warning: unknown locale eng, setting fallback to en-US

And here is the CSL payload from PubMed...

{
    ISSN: "0003-4738",
    PMID: "16515",
    author: Array[1],
    container-title: "Annals of allergy",
    container-title-short: "Ann Allergy",
    id: 0,
    issue: "5",
    issued: Object
    journalAbbreviation: "Ann Allergy",
    language: "eng",
    page: "311-5",
    page-first: "311",
    title: "The idenitity crisis of the allergy nurse associate-physician's assistant.",
    title-short: undefined,
    type: "article-journal",
    volume: "38",
}

So it looks like it may be ignoring the passed in locale. It looks like it's defaulting to the language field from the CSL itself. Is this expected?

fbennett commented 8 years ago

No, that warning isn't relevant. Name formats are language-dependent (in some domains, the full name is always used, and in others the ordering is fixed as family-name+given-name). When rendering a name, the processor hits a localeResolve() function to determine the canonical locale code for the name. It isn't selecting the style locale there.

The processor does not contain hard-coded locale terms anywhere, so somehow your function must be hitting the file locales-en-US.xml. You could test that by replacing it with another locale file; if tests continue to return English terms in output, we'll reeeeally have a mystery on our hands, but you should get whatever is in that file showing through in citations.

dsifford commented 8 years ago

Hmm very odd.

Here's where it gets REALLY strange. I have zero locale files saved locally. The only locale file that is even in existence when I make the call to CSL is the relevant locale file retrieved using HTTP.

In the case you saw in the video, that single file was the th-TH file. (I confirmed it using a breakpoint just as the CSL.engine was being called.)

Totally stumped.

fbennett commented 8 years ago

Do try the file substitution; if you continue to get English terms, I'll definitely go all mea culpa (in italics!) and eat humble pie -- and I'll be eager to figure out where on Earth they could be coming from.

fbennett commented 8 years ago

Mea culpa duly issued. I'll dig into this a bit here and write back soon.

fbennett commented 8 years ago

Assuming that the file data is retrieved client-side via XMLHttpRequest() ... it's a long shot, but have you tried clearing browser cache?

dsifford commented 8 years ago

I greatly appreciate the help!

I suppose I haven't tried clearing the cache (how annoying would it be if that was the issue?!)

I'll give it a shot in the morning. 3 am here in Detroit. Way past my bedtime 😄

If you happen to have Docker and docker-compose installed (oh and gulp), my repo is pretty much turnkey if you fork it, install the node dependencies (npm install), run gulp build, and then docker-compose up -d

(After writing that, it sounds like it's more exhausting of a process than it really is)

If not, no worries. I'll try the cache thing tomorrow and report back. 👍

fbennett commented 8 years ago

No further luck here with weird nudges to the locale environment. If I run the processor without an en-US locale available, the processor crashes (of course). If I set the retrieveLocale() function to always retrieve th-TH, it renders output in that locale.

So I'll go back to where I was. The only way English terms can reach the processor is through the retrieveLocale() function.

fbennett commented 8 years ago

Docker is something I've been meaning to explore, so I installed it and your project. I was able to run ABT, with the raw download, but when I figured out how to install gulp and tried to do a build (after making a small change to the code in lib), gulp build blew up, first one way, and then another (screenshots). As far as I know, I didn't change anything in the filesystem between the change in behavior. Not sure what the deal is. Anyway, maybe a future tag will be happier.

Was able to insert a cite and play around a little. It looks great so far - but getting inline citations working (and footnotes) will be important for academics who use non-numeric styles. I do think you would enjoy working with web workers, if you are able to tease WordPress into running the processor in one.

screenshot-1

screenshot-2

dsifford commented 8 years ago

Re: Gulp -- Sorry about that, I forgot to mention you also need to install the correct TypeScript typefiles (using this package from NPM). Just install it globally, then run typings install and everything should compile.

getting inline citations working (and footnotes) will be important for academics who use non-numeric styles.

Yeah, that's probably the single-largest task I still have to get finished. It's almost hard for me to even begin thinking about, but I suppose my life might be made easier if I look into webworkers like you mentioned.

While we're on that topic, I'm not certain how to even generate inline citations using citeproc. Can you clarify?

Finally, I'm going to try the cache thing here in a few minutes. I'll let you know.

dsifford commented 8 years ago

No luck.

Still getting those weird console warnings too...

image

dsifford commented 8 years ago

image

dsifford commented 8 years ago

(Off topic, but I just added better instructions to the readme for getting up and running in my repo).

fbennett commented 8 years ago

The console warnings are irrelevant. See my explanation up-thread.

fbennett commented 8 years ago

If I get things going, I'll first confirm the failure. What are the steps to reproduce a bad cite in the page?

dsifford commented 8 years ago

The console warnings are irrelevant. See my explanation up-thread.

Ah, that's right.. Forgot you mentioned that.

What are the steps to reproduce a bad cite in the page?

  1. In the WordPress admin section, click options => general and change the locale to thai (near the bottom. Squiggly symbols -- don't know how to really describe it).
  2. Create a new post, and insert a reference using either PMID, DOI, or by importing a RIS file.

The reference will be parsed as en-US (no squiggly Thai symbols anywhere to be found).

Let me know if I can clarify further.

dsifford commented 8 years ago

Oh, and if you already have the repo cloned locally, be sure to pull the changes I just pushed to the master branch. (there's lots)

dsifford commented 8 years ago

Oh, and if you're running nodejs version 6, the SASS processor is broken (so the gulp builds will fail)... I'm switching to PostCSS now because I don't feel like waiting for them to fix it.

fbennett commented 8 years ago

Re inline citations, the manual is in the process of moving home. I'll have it back online soon; but I have an idea for a dynamic demo page that will be better than a windy explanation. I'll build that in the next day or two and send you a link.
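Until the manual is back, here is a rough sketch of the usual inline-citation call, processCitationCluster(citation, citationsPre, citationsPost), which takes the new citation plus the lists of citations that come before and after it in the document. The interface names and the insertInlineCitation helper are hypothetical and exist only for illustration; the engine is passed in, so nothing here depends on how the processor is loaded.

```typescript
// Hypothetical structural types; only the call shape follows citeproc-js.
interface CitationItem {
    id: string;
}

interface Citation {
    citationID?: string;
    citationItems: CitationItem[];
    // noteIndex is 0 for in-text styles, or the footnote number for note styles.
    properties: { noteIndex: number };
}

// [citationID, noteIndex] pairs describing the other citations in the document.
type CitationRef = [string, number];

// Each update starts with [position in document, rendered text]; real engine
// versions may append further elements, which this sketch ignores.
type ClusterUpdate = [number, string];
type ClusterResult = [{ bibchange: boolean }, ClusterUpdate[]];

interface InlineEngine {
    processCitationCluster(
        citation: Citation,
        citationsPre: CitationRef[],
        citationsPost: CitationRef[]
    ): ClusterResult;
}

/**
 * Hypothetical helper: registers one citation with the engine and returns the
 * rendered strings keyed by citation position, so the caller can patch the page.
 */
function insertInlineCitation(
    engine: InlineEngine,
    citationID: string,
    itemIds: string[],
    citationsPre: CitationRef[],
    citationsPost: CitationRef[]
): Map<number, string> {
    const citation: Citation = {
        citationID,
        citationItems: itemIds.map((id) => ({ id })),
        properties: { noteIndex: 0 },
    };
    const [, updates] = engine.processCitationCluster(citation, citationsPre, citationsPost);
    const rendered = new Map<number, string>();
    for (const update of updates) {
        rendered.set(update[0], update[1]);
    }
    return rendered;
}
```

The engine may return updates for citations other than the one just inserted (for example when numbering or disambiguation shifts), which is why the helper returns every position rather than a single string.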

fbennett commented 8 years ago

Do you have a sample DOI? Many cites will show no localizable terms anyway. Something with many authors, or an item with a translator as one of the creators should do it.
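To make the point concrete, here is a toy sketch (not citeproc-js itself) of why a single-author cite looks identical in every locale. The TERMS table, the renderToy function, and the term values are illustrative assumptions, not the real locale files.

```typescript
// Illustrative term table; a real CSL locale file holds many more terms.
interface Terms {
    and: string;
    etAl: string;
    translator: string;
}

const TERMS: Record<string, Terms> = {
    "en-US": { and: "and", etAl: "et al.", translator: "trans." },
    "fr-FR": { and: "et", etAl: "et al.", translator: "trad." },
};

interface ToyItem {
    authors: string[];
    translator?: string;
    title: string;
}

// Toy renderer: locale terms are only consulted when there are several
// authors or a translator, so simple items render the same in any locale.
function renderToy(item: ToyItem, lang: string): string {
    const terms = TERMS[lang];
    let names: string;
    if (item.authors.length === 1) {
        names = item.authors[0];
    } else if (item.authors.length === 2) {
        names = `${item.authors[0]} ${terms.and} ${item.authors[1]}`;
    } else {
        names = `${item.authors[0]} ${terms.etAl}`;
    }
    const translated = item.translator ? ` (${terms.translator} ${item.translator})` : "";
    return `${names}. ${item.title}${translated}.`;
}

const solo: ToyItem = { authors: ["Doe J"], title: "A study" };
const pair: ToyItem = { authors: ["Doe J", "Roe R"], title: "A study" };

console.log(renderToy(solo, "en-US") === renderToy(solo, "fr-FR")); // true: no localizable term used
console.log(renderToy(pair, "en-US")); // Doe J and Roe R. A study.
console.log(renderToy(pair, "fr-FR")); // Doe J et Roe R. A study.
```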

dsifford commented 8 years ago

Re inline citations, the manual is in the process of moving home. I'll have it back online soon; but I have an idea for a dynamic demo page that will be better than a windy explanation. I'll build that in the next day or two and send you a link.

I'm eternally grateful for all your help and generosity with this!

Do you have a sample DOI? Many cites will show no localizable terms anyway. Something with many authors, or an item with a translator as one of the creators should do it.

I do not. That was my fear; that I was using references that wouldn't matter. I tried a handful of randomly typed PMIDs in the hopes that with one of them I'd see some squiggly lines, but I had no luck. Perhaps my randomness wasn't random enough to find the right ones?

I did happen to come across some French and Russian papers (as shown above).

fbennett commented 8 years ago

Here's what I get from a clean install, following the steps in the README:

gulp build
[06:18:42] Using gulpfile ~/src/academic-bloggers-toolkit/gulpfile.js
[06:18:42] Starting 'clean'...
[06:18:42] Starting 'webpack'...
[06:18:42] Starting 'sass'...
[06:18:42] Finished 'clean' after 253 ms

/home/bennett/src/academic-bloggers-toolkit/node_modules/gulp-autoprefixer/node_modules/postcss/lib/lazy-result.js:157
        this.processing = new Promise(function (resolve, reject) {
                              ^
ReferenceError: Promise is not defined
    at LazyResult.async (/home/bennett/src/academic-bloggers-toolkit/node_modules/gulp-autoprefixer/node_modules/postcss/lib/lazy-result.js:157:31)
    at LazyResult.then (/home/bennett/src/academic-bloggers-toolkit/node_modules/gulp-autoprefixer/node_modules/postcss/lib/lazy-result.js:79:21)
    at DestroyableTransform._transform (/home/bennett/src/academic-bloggers-toolkit/node_modules/gulp-autoprefixer/index.js:24:6)
    at DestroyableTransform.Transform._read (/home/bennett/src/academic-bloggers-toolkit/node_modules/gulp-autoprefixer/node_modules/through2/node_modules/readable-stream/lib/_stream_transform.js:159:10)
    at DestroyableTransform.Transform._write (/home/bennett/src/academic-bloggers-toolkit/node_modules/gulp-autoprefixer/node_modules/through2/node_modules/readable-stream/lib/_stream_transform.js:147:83)
    at doWrite (/home/bennett/src/academic-bloggers-toolkit/node_modules/gulp-autoprefixer/node_modules/through2/node_modules/readable-stream/lib/_stream_writable.js:313:64)
    at writeOrBuffer (/home/bennett/src/academic-bloggers-toolkit/node_modules/gulp-autoprefixer/node_modules/through2/node_modules/readable-stream/lib/_stream_writable.js:302:5)
    at DestroyableTransform.Writable.write (/home/bennett/src/academic-bloggers-toolkit/node_modules/gulp-autoprefixer/node_modules/through2/node_modules/readable-stream/lib/_stream_writable.js:241:11)
    at DestroyableTransform.ondata (/home/bennett/src/academic-bloggers-toolkit/node_modules/gulp-sass/node_modules/through2/node_modules/readable-stream/lib/_stream_readable.js:531:20)
    at DestroyableTransform.EventEmitter.emit (events.js:95:17)
bennett@black-slate:~/src/academic-bloggers-toolkit$ 

Not sure what's missing. nodejs version is v0.10.25, in case that's relevant.

dsifford commented 8 years ago

What version of node do you have? Promises were introduced maybe about a year or so ago (give or take). If your node version is super dated, then I suppose it might not know what they are?

I noticed you're on Ubuntu. That used to be my daily driver as well, but I got fed up with them being notoriously slow with updating stuff (I'm on an Arch distro now).

But anyway, yeah, check your node version (node --version) and let me know.
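For what it's worth, the same check can be sketched in code. This assumes native Promise first shipped in the 0.12 release line, which is my recollection rather than something confirmed in this thread; the function names are made up for illustration.

```typescript
// Hypothetical helper: pull major and minor numbers out of a version string
// such as "v0.10.25" (the format printed by node --version).
function parseMajorMinor(version: string): [number, number] {
    const match = /^v?(\d+)\.(\d+)/.exec(version);
    if (!match) {
        throw new Error(`Unrecognized version string: ${version}`);
    }
    return [Number(match[1]), Number(match[2])];
}

// Assumption: native Promise is available from 0.12 onward.
function supportsNativePromise(version: string): boolean {
    const [major, minor] = parseMajorMinor(version);
    return major > 0 || minor >= 12;
}

// Feature detection on the running interpreter, independent of version numbers.
const hasPromiseHere: boolean = typeof Promise !== "undefined";

console.log(supportsNativePromise("v0.10.25")); // false
console.log(supportsNativePromise("v4.4.0")); // true
console.log(hasPromiseHere);
```

In a build script, passing process.version to supportsNativePromise would allow failing early with a readable message instead of a ReferenceError deep inside a dependency.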

fbennett commented 8 years ago

It's v0.10.25

fbennett commented 8 years ago

Oh. Current is several levels higher in the major version number. That would be a problem, wouldn't it.

dsifford commented 8 years ago

Ah, yeah that's what I suspected.

To update your node to the latest one you have a couple different options..

  1. Use a version manager (nvm or n)
  2. Update explicitly to either node 5.x or 6.x with curl...

Version 5:

curl -sL https://deb.nodesource.com/setup_5.x | sudo -E bash -
sudo apt-get install -y nodejs

Version 6: (this is undocumented on nodejs's website, but it probably works. If not now, it will soon)

curl -sL https://deb.nodesource.com/setup_6.x | sudo -E bash -
sudo apt-get install -y nodejs

Node also recommends installing the build tools:

sudo apt-get install -y build-essential

For full instructions see here: https://nodejs.org/en/download/package-manager/#debian-and-ubuntu-based-linux-distributions

dsifford commented 8 years ago

The one that you get from the standard apt sources is suuuuuuuuper behind.

fbennett commented 8 years ago

Okay, I pulled 4.x, and everything builds. When I run docker-compose and connect to localhost:8080, I get the site, and I'm in logged-in state as root (possibly credentials were cached from yesterday's trial). No academic-bloggers-toolkit plugin though. I get an initial warning that it's being disabled because it is not present.

I could install over the wire, but that gets me the distributed source, I think, not the repo version. Not sure how to get the local code into WordPress.

fbennett commented 8 years ago

Nope, take that back. A restart brought it up. I've got some data, now to generate a reference ...

dsifford commented 8 years ago

Awesome!

Yeah, for future reference, when you're first building the containers (right after docker-compose up -d) you can use docker-compose logs -f to see build logs in the event that something errors in the build process.

I helped write the compose orchestration and made sure to make the error reporting detailed.

For a full rundown see this repo

Finally, when you're done, be sure to do docker-compose down -v (the -v flag deletes any associated volumes). When I was more novice with docker I made the mistake of not knowing about that little gem and it ended up bloating my /var directory to the tune of ~6 gigs of useless volumes!

Thanks again for lending your expertise! I'll be here for the next few hours if you need me for anything. 👍

fbennett commented 8 years ago

Try with this PMID: 12224981

:-)

dsifford commented 8 years ago

Magical!

So it is working! Wow, don't I feel stupid....

I figured I'd be getting a whole lot more squigglies than that!

Well gosh, I'm sorry for totally wasting a bunch of your time!

fbennett commented 8 years ago

No problem! This is really important work, and I'm glad to help. An editing environment with dynamic citation support will be a game-changer, and fulfill one of the aims I had when I started in on the CSL processor seven years ago. The processor has extended functionality for legal styles, which are a very tough nut to crack. Tech has been slow to penetrate that space for silly reasons, but citations are really important to legal discourse, and if the tool is built out to support legal styles, you'll have the legal community queuing up for support.

Beyond that, dynamic citation support in a web authoring platform will just generally lift the quality of public discourse. Text authored for the web is viewed as "informal" in large part because the tools for referencing are awkward and ugly. Fix that, and you have an entirely new picture.

To help out, I can build a sample web-worker instance and demo to use as a starting point, and place it in the processor documentation. Should be able to put something together in the next couple of days (touch wood).

fbennett commented 8 years ago

One other thought. Once inline citations + bibliography is implemented in the way it's handled in word processors, there would be no need to cook the citation format into the published page; the reader could be given control over the style, which would be flash and kind of neat.

dsifford commented 8 years ago

My thoughts absolutely mirror yours.

Re: You kindly offering to help -- That would be wonderful. I'll take as much help / tips / words of encouragement as you're offering to give!

Re: Dynamic citations. Yeah, that was the direction I was interested in heading (once I got all the groundwork laid out). I agree that would be a totally cool feature to have.