jacobwindsor / MetabMaster

Learn about metabolic pathways using Pvjs and interactive descriptions
1 stars 0 forks source link

Citations #20

Open jacobwindsor opened 7 years ago

jacobwindsor commented 7 years ago

Something that would be extremely useful for these interactive descriptions is Wikipedia style citations. They should look something like this [[1]()] with a reference list at the bottom.

This should be fairly easily achievable with a Showdown extension in a similar vein to Kaavio Showdown and @larsgw's citation.js. P.S: @larsgw, I'm putting this issue here but ultimately it is an extension to WikiPathways.

The only issue with this method is that the descriptions do not belong to the GPML, they just reference some GPML via a WikiPathways ID. The GPML can have some references and it's quite likely that many references will be shared among the GPML and the description. @egonw any thoughts?

egonw commented 7 years ago

Not entirely following you... where does the GPML come in?

larsgw commented 7 years ago

I looked into Showdown extensions, but unfortunately they don't support async yet (see https://github.com/showdownjs/showdown/issues/322). Currently, Citation.js doesn't support async either (and loads resources synchronously), but I'm planning to change that (see https://github.com/larsgw/citation.js/issues/29). I could write an extension with sync, or I could look into PR'ing Showdown, and I'll probably do both.

jacobwindsor commented 7 years ago

@larsgw why do you need to do it asynchronously?

larsgw commented 7 years ago

e.g. to fetch metadata from an article where only the wikidata ID is given in the markdown. I currently use sync-request, but synchronous requests are strongly discouraged (for good reasons), so I would like to make it work asynchronously.

jacobwindsor commented 7 years ago

@egonw I mean the the references that may occur in the Biopax element. I'd imagine there would be a lot of overlap between the references in the description and in there. It's not a big problem I suppose but it's a little confusing to have two bibliographies - one for the references in the GPML and one for the references in the description.

jacobwindsor commented 7 years ago

@larsgw ah I see. I can imagine that's going to be a large PR in Showdown. You don't really need an async extension though - see this workaround in markdown-it.

With Kaavio Showdown, I see it more like this.

  1. Create a custom method using citation.js to get the required citation information from a string like ^[some citation]. This method will replace it with the required data for Showdown so you get something like cite[author, year, title]
  2. Add a custom extension to Showdown to replace the cite[author, year, title] with the Wikipedia style reference, adding everything to the bibliography at the end.

The custom method could be used before calling makeHTML() or makeHTML could just be overrided with the custom logic.

It's not as nice as having an async extension in Showdown itself, but hey it works :)

larsgw commented 7 years ago

I was already planning to make a small tool replacing annotated HTML elements with references, something like this (cjs is the DOM abbr I use for Citation.js):

<div class="cjs-replace" data-input="..." data-option-output-style="..." etc></div>
<script src="path/to/said/tool.js"></script>

That would come in handy when used in web pages: simply put in one HTML element per reference, excluding script element, and you're done.

Using that with a showdown plugin that replaces ^[some citation] and/or cite[author, year, title] would work, but I'll have to look into not making synchronous use too complex for what it is. Perhaps something like this:

As a note: the last bullet point would require the annotated HTML to already be that link, as there is no client side code to change that, but that shouldn't be a problem, and it would provide a (somewhat) nice thing to look at while waiting for https://wikidata.org/entity/Q1.json to load.

Providing output options would be difficult, but the only real option you would be missing out on is the citation style. Maybe I should just come up with a standard one, that everyone agrees with. Jokes aside, it is an issue. The only solutions that come to mind are document-wide declaration (un-markdown-y, and not self-contained ^[...] statements) or repeating the option everywhere (annoying as well), so I'll look into this later.

Anyway, that's more related to the general concept of the plugin, not how it would work asynchronously. The setup above still stands, and this way, it would degrade gracefully, and both tools would be able to be used independently.

jacobwindsor commented 7 years ago

Awesome!

I'm not sure I'm 100% following you but is this correct?

  1. User inputs markdown like ^[<DIRECT CITE INPUT>] or^[<AUTHOR>, <YEAR>, <TITLE>, ...]
  2. This is then parsed to a cjs element
  3. If the user uses a direct cite input, CJS gets the relevent information async
  4. If the user uses author, year, title input then CJS just renders the correct citation

If this is the case then I see one issue with it. In most cases the dev (including me) would like to know when all of the citations are finished rendering. With this case, I assume the CJS elements are separate CJS instances. You can of course display something like [loading citation...] while the citation loads but that's not so nice.

I don't really understand what you mean by output options.

P.S I'd use a <cjs author="..." year="..." title="..." wikidata="..."></cjs> custom element. As a note to the loading you could then give it a required manager attribute which will register the element to the manager, taking control of loading and possibly other things. I'm not sure if that's good practise but it's the first thing I thought of

larsgw commented 7 years ago

It's more like this (sorry for the overkill):

cjs-tool-showdown

<async?> is about if asynchronous operations need to be done, not if it is available. In the diagram, unavailability of async is assumed.

A callback in the HTML tool would be possible (find all elements to be changed; make a boolean array with a false value for each element; when one is loaded, switch its boolean and check if all are now true). This can of course, be built into the tool itself.

With output options I mean the output options for Cite, i.e. language; 'real' or string representation of the following; HTML, plain text or JSON; last but not least, output style, which can be CSL-JSON, BibTeX, 'BibTeX JSON' (parsed BibTeX) or formatted citations with a provided citation style. Only the citation style and possibly the language are relevant in this case, I think.

I'd love to use custom elements, but it seems a lot of trouble with browser support, polyfills, possibly polyfills for the polyfills, licences for polyfills and everything related. I'll have a look at it.

jacobwindsor commented 7 years ago

Not overkill at all, it's good to see a diagram sometimes.

I'm wondering why you're bothering with sync operations at all. It would simplify the API a lot if everything returned a promise/observable regardless of whether an HTTP fetch had to be called. If i'm barking down the wrong tree then please say.

I don't know exactly what your API is like but say you have a cite method like so:

cjs.cite(options: {wikidata?, doi?, author?, year?,...}): Promise<any> {
    let data;
    if (options.wikidata || options.doi) {
        // Do the fetch stuff here
        data = {<some object with Cite data>}
    else {
        // Just build the data from author, year, title etc.
        data = {<some object with Cite data>}
    }
    return new Promise((resolve,reject) => resolve(data));
});

It's much easier to work with when you know the response is only one type. The client doesn't have to care about whether data is being fetched via HTTP or not. It always returns a promise no matter what.

Custom elements are really nice, we actually use them in Pvjs. If you follow the guide from Google I linked to it's pretty straightforward. The available polyfills are pretty good and you don't need to worry about super old browsers IMO.

With custom elements, this is how I see it. The custom element uses the aforementioned cite method (or whichever method is in your API). Until the promise is resolved there will be some kind of loading logic. When the promise has resolved, just set the inner HTML of the element to the resolved data.

Including Markdown you get this flow (sorry but I'm too lazy to make a nice diagram):

^[direct cite] OR ^[author, year, title]
                      |
                      |
                      V
<cjs bib="<selector for bibliography>" doi="<if given>" wikidata="<if given>" title="<if given>" author="<if given>" year="<if given>"></cjs>
                      |
                      |
                      V
Custom element uses the cjs cite method to either grab the data from another web service or just build it up from the author, year, and title.
                       |
                       |
                       V
<cjs doi="<if given>" wikidata="<if given>" title="<if given>" author="<if given>" year="<if given>">
    <p>[<a href="#linktoref">1</a>]</p>
    // Also fill in the bibliography at the end. With the bibliography selector
</cjs>

Let me know your thoughts on this flow. I think it's more straightforward but you know how it works at the moment so I could easily have overlooked things.

I'm not sure what the HTML Tool refers to, is it the tool that create the <div class="cjs"...></div>?

jacobwindsor commented 7 years ago

You can see the very minimal amount of polyfills used in Pvjs here. The setup for the custom element is here.

I should note that it hasn't been cross browser tested yet.

larsgw commented 7 years ago

First version done. HTML handling isn't as nice as what it is going to be, but that's partly because there's no option to amend HTML in both citeproc-js and Citation.js, or at least not the way I'm looking for. Links don't work yet, and sorting and shown text isn't really finished either, but I'm working on it.

It should work simply by requireing the module. No values need to be assigned (which I don't like, but that's how the boilerplate worked; I'll have a look at it later). Example code (not es6, just quick code):

var Showdown = require('showdown')

require('citation-js-showdown') // registers extension

var converter = new Showdown.Converter({ extensions: ['citation.js'] })

var text = 'test ^[Q1,Q27795847] test'
var html = converter.makeHtml(text)

console.log(html)

Output is like this:

<p>test <sup>[1]</sup> <sup>[2]</sup> test</p>
<div class="csl-bib-body">
  <div data-csl-entry-id="1" class="csl-entry">universe. (n.d.).</div>
<br />
  <div data-csl-entry-id="2" class="csl-entry">Wohlgemuth, G., Mehta, S. S., Mejia, R. F., Neumann, S., Pedrosa, D., Pluskal, T., … Fiehn, O. SPLASH, a hashed identifier for mass spectra. <i>Nature Biotechnology</i>, <i>34</i>, 1099–1101. https://doi.org/10.1038/NBT.3689</div>
<p></div></p>

The weird HTML probably has to do with interferences with other rules, I'll have a look at that too later. As you can see, the IDs aren't shown in the references, but they do with a simple CSS rule:

.csl-entry:before {
  content: '[' attr(data-csl-entry-id) ']: '
}

Again, first version. I already had to change Wikidata parsing (https://github.com/larsgw/citation.js/issues/35) to push the new version with the new exposed parsing methods which I needed for this code. Fixing all this will need changes to Citation.js that aren't even officially on the roadmap yet, so that may take a while, depending on how much other stuff I have to do.

larsgw commented 7 years ago

I forgot to mention: Citation.js doesn't actually support async yet, so I didn't have to go through all that trouble in this version. Custom elements would be a part of the HTML tool (the one that replaces the <cjs> with citations), so that's why that's not included yet.

jacobwindsor commented 7 years ago

Brilliant work!

My only comment is to unit test it. Given that it's so easy to unit test (just provide an md and html file) it's well worth it. I found quite some unexpected bugs by introducing them.

Re: weird HTML - if it is to do with interference with other rules then I would just change the syntax. Something like cite[input] is fine too.

jacobwindsor commented 7 years ago

In terms of integrating this into MetabMaster - I would like to give it a try at least. I'm a bit cautious about using it because of the sync requests in Citation.js. I would like to creating a citation-dev branch and possibly serve that via staging.metabmaster.jcbwndsr.com .

If you have time than please feel free to create a PR. I'm currently writing my thesis so not sure I'll have time for a little while. I would like to include this feature in there though and it is definitely something that should be included in WikiPathways if and when that happens

larsgw commented 7 years ago

Re: Unit testing - definitely. Publishing as v1.0 was an accident. I'm aware of the problems (well, at least some of them). Re: Re: weird HTML - I think it's the output <div>s, <br>s and \ns conflicting, I'll look at that too. Re: sync - I'm working on the todo-list at Citation.js to try to get to async as fast as possible, but it'll take a while, especially now holiday is almost over. Re: PR - I'll take a look at it

larsgw commented 7 years ago

What issues are you having? I'm working on tests and improvements, so it would be nice to know if I missed something. The part conflicting is the divs outputted by a third-party library (in string format), so it's hard to do something about that.

EDIT: The actual conflict was caused by showdown, and is now fixed. I raised an issue: https://github.com/showdownjs/showdown/issues/387.