Change strategy of how citationstyles gets list of items to style from view from raw HTML to JS structure

joelburton commented 11 years ago

Currently, there is a serious mismatch between CMFBibAT and collective.citationstyles.

CMFBibAT provides page template views with citations on them, and, using a viewlet and JS, citationstyles hides the actual citations on the view and injects reformatted citations. This works well but does not handle cases of batching neatly and does not allow for a reasonable solution to cases where CMFBibAT wants to sort the items differently than citationstyles.

Going forward, it would be helpful for the underlying view (currently provided by CMFBibAT) to provide a JS data structure with all citation items it wants to return, irrespective of batching. It should also provide information on the batching the view is using (the number of items in the batch and the batch's starting item). It could then provide rendered citations in its default, non-CSL styles as it does now.
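As a rough illustration of the structure described above, here is a minimal Python sketch of what the view might serialize. The function name and the key names (`items`, `batch`, `start`, `size`) are hypothetical and are not part of CMFBibAT or collective.citationstyles.

```python
import json


def build_citation_payload(items, batch_start, batch_size):
    """Serialize every citation item plus the view's batching info.

    All names here are assumptions made for illustration only.
    """
    payload = {
        # Every item the view wants to return, irrespective of batching.
        "items": list(items),
        # The batch window the view itself is currently displaying.
        "batch": {"start": batch_start, "size": batch_size},
    }
    return json.dumps(payload, sort_keys=True)


items = [
    {"id": "smith2001", "title": "A Study", "author": "Smith"},
    {"id": "jones1999", "title": "Another Study", "author": "Jones"},
    {"id": "adams2010", "title": "Third Study", "author": "Adams"},
]
payload_json = build_citation_payload(items, batch_start=0, batch_size=2)
```

The JS side could then style the full list and show only the window described by `batch`.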

Rather than our looking for the rendered citations on the page directly (which would be too tied to current sort order/batching), our JS code would work on the JS structure. We could then inject stylized citations onto the page, using the same batching limits as the underlying view. We could benefit from the batching UI (i.e., next/prev links, etc.) provided by the view.

This would be a significant change to CMFBibAT. This is likely to be an architectural change when this moves to a more modern-style product.

alecpm commented 11 years ago

Seems like rendering all Bib items in the page (even if only as JSON) could be a problem for relatively large bibliographies. Wouldn't an on-demand solution with batching and sorting provided within a JSON API make more sense?
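To make the on-demand idea concrete, here is a small sketch of the query such a JSON API might answer: sort on the server, then return only the requested batch. The function name, its parameters, and the response keys are hypothetical, and a plain list stands in for whatever catalog query the real view would use.

```python
def query_citations(items, sort_key, start=0, size=10, reverse=False):
    """Return one sorted batch of items plus the total count.

    Hypothetical sketch: a real view would query the catalog rather than
    sort an in-memory list.
    """
    ordered = sorted(
        items, key=lambda item: item.get(sort_key, ""), reverse=reverse
    )
    return {
        "total": len(ordered),
        "start": start,
        "items": ordered[start:start + size],
    }


library = [
    {"id": "smith2001", "author": "Smith"},
    {"id": "jones1999", "author": "Jones"},
    {"id": "adams2010", "author": "Adams"},
]
first_page = query_citations(library, sort_key="author", start=0, size=2)
second_page = query_citations(library, sort_key="author", start=2, size=2)
```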

cewing commented 11 years ago

@alecpm the problem we face is that different bibliography style sheets (in the CSL sense) may implement their own particular ordering. So long as we are in fact using client-side rendering to display the bibliographic items, I think it is going to be a hard thing to get around how we reconcile "plone" ordering of items and the ordering as determined by a CSL style.

My preference would be to shoot a big JSON blob at a browser, load the entries into an engine there, and then deal with pagination entirely on the client side. But if the Python version of the citeproc engine stabilizes, perhaps we could do rendering server side and deal with pagination there. The issue of differences in Plone order and CSL order remains a tough nut to crack, though.

joelburton commented 11 years ago

If we had some way to know what the ordering would be, we could eliminate this and produce just the items on the page, as the iterator could do the same ordering as the JS and the same batching as this view. Would this be possible somehow? (i.e., is sort order something we can introspect from the styles? Or something we add to the control panel, so that they tell us about the styles?)

cewing commented 11 years ago

The stylesheets are just XML, so they should be introspectable. It'd just be a matter of determining the sort a style desires, and then implementing it on the Python side.
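A minimal sketch of that introspection, using only the standard library. It assumes the style follows the CSL 1.0 layout, where bibliography sort keys live in `bibliography/sort/key` elements under the `http://purl.org/net/xbiblio/csl` namespace; the sample style and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

CSL_NS = {"csl": "http://purl.org/net/xbiblio/csl"}

# A stripped-down, made-up style used only to exercise the function.
SAMPLE_STYLE = """
<style xmlns="http://purl.org/net/xbiblio/csl" version="1.0" class="in-text">
  <bibliography>
    <sort>
      <key macro="author"/>
      <key variable="issued" sort="descending"/>
    </sort>
    <layout/>
  </bibliography>
</style>
"""


def bibliography_sort_keys(style_xml):
    """Return (kind, name, direction) for each bibliography sort key."""
    root = ET.fromstring(style_xml.strip())
    keys = []
    for key in root.findall("csl:bibliography/csl:sort/csl:key", CSL_NS):
        if "variable" in key.attrib:
            kind, name = "variable", key.attrib["variable"]
        else:
            kind, name = "macro", key.attrib.get("macro")
        # When the attribute is absent, treat the key as ascending.
        keys.append((kind, name, key.attrib.get("sort", "ascending")))
    return keys


sort_keys = bibliography_sort_keys(SAMPLE_STYLE)
```

Keys that name a `variable` map fairly directly onto a Python-side sort. Keys that name a `macro` would need the macro's output reproduced in Python, which is likely the harder part.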

alecpm commented 11 years ago

I wonder if the CS JS engine could tell us something about the style sorting which we could pass as a parameter to JSON queries. Batching can be pretty critical in terms of server side performance, in addition to its usefulness on the UX side.

joelburton commented 11 years ago

Thinking about it, we'd want to have knowledge of sorting anyway. Without sorting knowledge in Python, we wouldn't know which items to ask for from @alecpm's ideal API, so we would still have to ask for everything and let CSL sort out what appears here.

Yuck. This different sorting requirement is a PITA.
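A toy example of why the sort order matters for batching: with the same four items and the same batch size, the first batch under one ordering can share nothing with the first batch under another. The field names (`position` standing in for Plone's order, `author` standing in for a CSL style's order) are assumptions for illustration.

```python
items = [
    {"id": "a", "position": 0, "author": "Zimmer"},
    {"id": "b", "position": 1, "author": "Young"},
    {"id": "c", "position": 2, "author": "Adams"},
    {"id": "d", "position": 3, "author": "Baker"},
]


def first_batch(entries, key, size):
    """Sort by the given field, then return the ids in the first batch."""
    ordered = sorted(entries, key=lambda entry: entry[key])
    return [entry["id"] for entry in ordered[:size]]


# Batch chosen by the server using its own order.
plone_batch = first_batch(items, "position", 2)
# Batch the style would show if it sorted the full list by author.
csl_batch = first_batch(items, "author", 2)
```

If the server sends only `plone_batch`, the style can reorder those two items but can never surface the two that belong on its first page.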

cewing commented 11 years ago

If there is a CSL engine API for determining the sort provided by the currently loaded style, we could potentially do as @alecpm suggests, and send that info along to the JSON view. I have not yet dug deeply enough into the CSL engine API to know what is possible from that side of things.

alecpm commented 11 years ago

If loading the full dataset ends up being a necessity, I would highly recommend separating that render from the ZPT render, and loading it asynchronously with JS from a separately cacheable and gzippable API call. One reason is that a very large HTML page could significantly impact Diazo rendering time. I ran into something on UMP where a ZCatalog bug and a plone.batching bug interacted to trigger 3k un-batched search results, rendered in a ZPT (as HTML, rather than JSON, so it might not be entirely instructive). The debugger seemed to indicate that much of the time was being spent in the Diazo pipeline, after the page render had completed.
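A sketch of what "separately cacheable and gzippable" could look like for the JSON body, using only the standard library. The function name is hypothetical, and a real Plone view would set these headers on its response object rather than return a dict.

```python
import gzip
import hashlib
import json


def build_cacheable_response(items):
    """Return (headers, gzip-compressed JSON body) for the item list.

    Hypothetical helper: it only shows the compression and validator
    steps, not how a Zope/Plone response would be populated.
    """
    # sort_keys makes the body deterministic, so the ETag stays stable.
    body = json.dumps({"items": items}, sort_keys=True).encode("utf-8")
    etag = hashlib.sha256(body).hexdigest()
    headers = {
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",
        "ETag": '"%s"' % etag,
    }
    return headers, gzip.compress(body)


entries = [{"id": "smith2001", "title": "A Study"}]
headers, compressed = build_cacheable_response(entries)
```

Because the ETag is derived from the uncompressed body, a client or proxy can revalidate with `If-None-Match` and skip the download when the bibliography has not changed.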

collective / collective.citationstyles

Change strategy of how citationstyles gets list of items to style from view from raw HTML to JS structure #16