Juris-M / zotero

Juris-M is a variant of the free and friendly Zotero research platform, with support for legal and multilingual materials.
https://juris-m.github.io

Options for support of consolidated bibliography items #81

Closed fbennett closed 4 years ago

fbennett commented 4 years ago

(This is a continuation of discussion that began roughly at https://github.com/Juris-M/zotero/issues/76#issuecomment-697355114.)

This is a proposal for consolidation or special formatting of multiple chapter items in the bibliography. The notes in this post are initial thoughts, open to change (or abandonment) in light of further discussion. The starting point is a long-standing feature of Jurism styles that applies an internal legislation_id to items of four types (bill, legislation, regulation, treaty), generated by concatenating the following fields: type, title, jurisdiction, genre, volume, container-title, original-date, and issued. The resulting legislation_id value is shared among items that match on all of these fields. This is needed because such items can accept a section value that is (in ordinary Zotero terms) a pinpoint. The shared legislation_id value allows cites to the same statute, but from a different item, to be treated correctly as subsequent references to the statute. The statute is then cited only once in the bibliography.[1]
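As a rough illustration of the generated-ID idea described above, here is a minimal sketch. The field list and the four item types come from the post; the function name, the separator, and the item shape are assumptions made for illustration, not the citeproc-js implementation.

```javascript
// Hypothetical sketch of a generated grouping ID. Only the field list and
// the four item types are taken from the post; everything else is assumed.
const GROUPED_TYPES = new Set(["bill", "legislation", "regulation", "treaty"]);
const ID_FIELDS = [
  "type", "title", "jurisdiction", "genre",
  "volume", "container-title", "original-date", "issued",
];

function makeLegislationId(item) {
  if (!GROUPED_TYPES.has(item.type)) return null;
  // JSON.stringify keeps structured values (such as dates) comparable;
  // a missing field contributes an empty string.
  return ID_FIELDS
    .map((f) => (item[f] === undefined ? "" : JSON.stringify(item[f])))
    .join("|");
}

// Two items for the same statute that differ only in their section value.
const itemA = { type: "legislation", title: "Civil Code", jurisdiction: "jp", section: "sec. 90" };
const itemB = { type: "legislation", title: "Civil Code", jurisdiction: "jp", section: "sec. 709" };
const itemC = { type: "book", title: "Civil Code" };
```

Because section is not part of the key, itemA and itemB share one ID and can be treated as references to the same statute, while itemC is left out of the grouping.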

The same generated-ID strategy can be applied to the problem of chapters with a separate chapters_id, but the output requirements will differ from those of legislation. Requirements will also vary between styles, raising the need for some additional control attributes. Here are (I think) the output requirements:

These could be supported with a style (or bibliography) option with a name like consolidate-chapter-items taking options that maybe follow the nomenclature above:

In addition, to support (iii) and (iv), a new condition chapter-count with a value of the number of chapter entries submitted for rendering would be added, to allow different CSL structures to be applied to the single and multiple forms.

This would be an improvement on the status quo, but it would not be fully automatic in Zotero/Jurism. In a simple implementation, at least, if (i) or (iv) render a cross-reference to the full book item, the book item itself would need to be inserted into the bibliography separately, if not cited directly in the document. Separately, (ii) & (iii) would require some non-intuitive care in sort macros, to avoid sorting by an author or title not rendered in the bibliography.

(Edit: The single-each-multi-each attribute is unnecessary. It can be implemented using all-each and a chapter-count conditional in the bibliography CSL.)

(Edit: Ha. Likewise single-each-multi-one can be implemented with all-one and a chapter-count conditional. Where this leaves us is with a simple binary choice in the bibliography option of either consolidating chapters or not. So in the end, the option values become just true and false.)

[1] The consolidation feature in the bibliography was removed from citeproc-js at some point, but it will be restored in the next release.

fbennett commented 4 years ago

This has been implemented. Tests that exercise the new functionality are here:

fbennett commented 4 years ago

After a round of updates to address feedback, documentation on the container-consolidation support is available, together with an update to the beta version of the client.

(If you have yesterday's beta installed, it's probably safest to remove any Commenter creators on the Chapter type before upgrading. I'm not sure, but those might cause a problem for sync.) @bwiernik, @georgd

denismaier commented 4 years ago

Concerning container-item-count: there could be an equivalent for the notes. Something like container-seen-before, container-subsequent or so. That would be useful if you don't consolidate containers but want to shorten citations when a container has been seen before.

fbennett commented 4 years ago

Concerning container-item-count: there could be an equivalent for the notes. Something like container-seen-before, container-subsequent or so. That would be useful if you don't consolidate containers but want to shorten citations when a container has been seen before.

Happily that is already covered. For tracked item types, the subsequent condition looks to the last rendering of an item with a matching container. To check the last item specifically, you need to test for strict-subsequent in the subsequent context. The position_TwoStageChapterSubsequent test illustrates the behavior.

(It works out neatly in the processor internals: since ambiguity testing is done with the plain subsequent form, all items in the set will be reformatted if a citation calling any one of them is moved in the document.)

denismaier commented 4 years ago

Great. What about tracking the first note number of a consolidated container?

fbennett commented 4 years ago

Thanks for this -- in addition to raising the need for a second back-ref counter, looking into the issue exposed a bug. I think we'll need a second back-ref variable. Something like first-container-reference-note-number?
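To make the proposed second back-reference concrete, here is a minimal sketch of tracking the first note number per item and per container. It assumes cites arrive in document order and that each cite carries a hypothetical containerId (standing in for the generated container ID); the function and property names are illustrative only.

```javascript
// Hypothetical sketch: record the first note number for each item and,
// separately, for each container. Cites are assumed to be in document order.
function trackFirstNotes(cites) {
  const firstByItem = new Map();
  const firstByContainer = new Map();
  return cites.map(({ itemId, containerId, noteNumber }) => {
    if (!firstByItem.has(itemId)) firstByItem.set(itemId, noteNumber);
    if (!firstByContainer.has(containerId)) firstByContainer.set(containerId, noteNumber);
    return {
      itemId,
      noteNumber,
      firstReferenceNoteNumber: firstByItem.get(itemId),
      firstContainerReferenceNoteNumber: firstByContainer.get(containerId),
    };
  });
}

const tracked = trackFirstNotes([
  { itemId: "ch1", containerId: "bookA", noteNumber: 1 },
  { itemId: "ch2", containerId: "bookA", noteNumber: 3 },
  { itemId: "ch1", containerId: "bookA", noteNumber: 5 },
]);
```

In this example the second chapter is first cited in note 3, but its container was first seen in note 1, which is the value a short-form cross-reference to the container would need.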

denismaier commented 4 years ago

Thanks for this -- in addition to raising the need for a second back-ref counter, looking into the issue exposed a bug. I think we'll need a second back-ref variable. Something like first-container-reference-note-number?

Looks good to me

denismaier commented 4 years ago

Happily that is already covered. For tracked item types, the subsequent condition looks to the last rendering of an item with a matching container. To check the last item specifically, you need to test for strict-subsequent in the subsequent context. The position_TwoStageChapterSubsequent test illustrates the behavior.

I'm not sure but wouldn't it be better the other way round? Like a new condition that includes tracked containers, but keeping subsequent behaviour as is. Maybe I'm just misunderstanding how this is supposed to work.

fbennett commented 4 years ago

As currently set up, if the behavior of subsequent were not changed, a reference to a new item under the same container would evaluate false for subsequent, failing over to else to be rendered as a first, full-form reference, which is not what we want. The alternative would be to cast a first-class container-subsequent condition that could be added after the subsequent test. That would have a cleaner appearance in the CSL, and (I think) would work fine. I'll see if I can brew up code to make it work that way so we can take a look.

bwiernik commented 4 years ago

I think container-subsequent would be a much clearer condition. Let's go with that direction.

denismaier commented 4 years ago

I think container-subsequent would be a much clearer condition. Let's go with that direction.

I think so too.

denismaier commented 4 years ago

Another question: will it be possible to add full container details to the first citation or to the first reference in a bibliography, and a short reference afterwards, but without adding the container itself to the bibliography?

fbennett commented 4 years ago

Another question: will it be possible to add full container details to the first citation or to the first reference in a bibliography, and a short reference afterwards, but without adding the container itself to the bibliography?

The container itself is out of scope for the consolidation machinery. The position operators can distinguish between first reference to any item in a given container, subsequent references to items in the container, and subsequent references in the usual sense (and ibid/ibid w/locator). So you can give full details in the first reference, if it is recorded on the item: whatever you can tease out of CSL.
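The three-way distinction described here can be sketched as a simple classifier. This is an assumption-laden illustration, not the processor's logic: it ignores ibid and ibid-with-locator, treats positions as plain strings, and relies on a hypothetical containerId on each cite.

```javascript
// Hypothetical sketch: classify each cite as "first", "container-subsequent"
// (new item, container already seen), or "subsequent" (item already seen).
function classifyPositions(cites) {
  const seenItems = new Set();
  const seenContainers = new Set();
  return cites.map(({ itemId, containerId }) => {
    let position;
    if (seenItems.has(itemId)) {
      position = "subsequent";
    } else if (containerId && seenContainers.has(containerId)) {
      position = "container-subsequent";
    } else {
      position = "first";
    }
    seenItems.add(itemId);
    if (containerId) seenContainers.add(containerId);
    return position;
  });
}

const positions = classifyPositions([
  { itemId: "ch1", containerId: "bookA" },
  { itemId: "ch2", containerId: "bookA" },
  { itemId: "ch1", containerId: "bookA" },
  { itemId: "art1", containerId: null },
]);
```

A style could then give full container details only where the position is "first" and a shortened container reference where it is "container-subsequent".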

fbennett commented 4 years ago

I have working code for container-subsequent already, but it's late here and there is a problem I'll need to address tomorrow. I've changed the numeric values of the position flags in the code, and it now occurs to me that the literal values will have been seeded into documents all over the Internet, so distributing the processor with new flag values would trigger quite a bit of chaos. The fix for it is clear, but I need to sleep. We should be able to wrap this up tomorrow, though.
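The post does not spell out the fix. One plausible reading, sketched below under that assumption, is to leave the existing numeric values untouched and append the new flag with a fresh value, so that values already stored in documents keep their meaning. The constant names and numbers here are illustrative, not confirmed citeproc-js values.

```javascript
// Hypothetical sketch of append-only flag numbering. The names and numeric
// values are illustrative assumptions, not taken from the processor source.
const POSITION = Object.freeze({
  FIRST: 0,
  SUBSEQUENT: 1,
  IBID: 2,
  IBID_WITH_LOCATOR: 3,
  CONTAINER_SUBSEQUENT: 4, // new flag appended after the existing values
});

// A value stored in an older document still decodes to the same name.
function positionName(value) {
  const hit = Object.entries(POSITION).find(([, v]) => v === value);
  return hit ? hit[0] : "UNKNOWN";
}
```

The design point is only that persisted literals act as a public interface: renumbering them silently changes the meaning of data already saved in documents.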

fbennett commented 4 years ago

Another question: will it be possible to add full container details to the first citation or to the first reference in a bibliography, and a short reference afterwards, but without adding the container itself to the bibliography?

In the bibliography, we could add a condition container-item-pos or so that evaluates true for the first item. It would be simple to add it, if there are suggestions for naming and syntax.

denismaier commented 4 years ago

Another question: will it be possible to add full container details to the first citation or to the first reference in a bibliography, and a short reference afterwards, but without adding the container itself to the bibliography?

The container itself is out of scope for the consolidation machinery. The position operators can distinguish between first reference to any item in a given container, subsequent references to items in the container, and subsequent references in the usual sense (and ibid/ibid w/locator). So you can give full details in the first reference, if it is recorded on the item: whatever you can tease out of CSL.

So, adding the container is done manually?

denismaier commented 4 years ago

Another question: will it be possible to add full container details to the first citation or to the first reference in a bibliography, and a short reference afterwards, but without adding the container itself to the bibliography?

In the bibliography, we could add a condition container-item-pos or so that evaluates true for the first item. It would be simple to add it, if there are suggestions for naming and syntax.

Sounds good. What about container-subsequent-in-bibliography? That's again the other way round, but I think it's more intuitive that way. After all, the current behaviour should still be the default.

denismaier commented 4 years ago

By the way, I wonder how all this will affect vanilla CSL. Thoughts? cc @bwiernik @bdarcus @jgm @cormacrelf

bwiernik commented 4 years ago

(@denismaier In general, it seems to go smoother if just a few people figure out a solution before we tag in a bunch of others.)

As I've been following this discussion, I've been bearing vanilla CSL in mind. I think this approach as it is being developed could generally be adopted as-is.

@fbennett Two thoughts.

First, so the Annual Review style of:

Author, A. A. (2010). First chapter title. In Editor (2010).
Author, B. B. (2010). Second chapter title. In Editor (2010).
Editor, E. E. (2010). Book title.

is not possible with this system? That seems like a huge limitation that pretty severely limits the applicability to basically only work with note styles, rather than in the bibliography or with author-date styles. Can we think through options to make it cover the fuller range of styles?

How do consolidated items currently appear in the bibliography? Just as full references?

Second, some of the position/note-number management problems you are solving here seem reminiscent of Chemistry-style compound citations (e.g., managing the citation group number and the citation item number). Could some of the new logic here potentially be reused (obviously the attributes would be distinct entities)?

fbennett commented 4 years ago

So, adding the container is done manually?

Yes. Unless it is cited directly in the document, the user would need to insert it into the bibliography as an uncited item.

fbennett commented 4 years ago

Another question: will it be possible to add full container details to the first citation or to the first reference in a bibliography, and a short reference afterwards, but without adding the container itself to the bibliography?

In the bibliography, we could add a condition container-item-pos or so that evaluates true for the first item. It would be simple to add it, if there are suggestions for naming and syntax.

Sounds good. What about container-subsequent-in-bibliography? That's again the other way round, but I think it's more intuitive that way. After all, the current behaviour should still be the default.

Will do.

fbennett commented 4 years ago

First, so the Annual Review style of:

Author, A. A. (2010). First chapter title. In Editor (2010).
Author, B. B. (2010). Second chapter title. In Editor (2010).
Editor, E. E. (2010). Book title.

is not possible with this system? That seems like a huge limitation that pretty severely limits the applicability to basically only work with note styles, rather than in the bibliography or with author-date styles. Can we think through options to make it cover the fuller range of styles?

It will support that style, with the one limitation that I noted at the outset. If the "Book title" reference is not cited directly in the document, it will need to be inserted as an uncited reference. For the chapter entries, the cross-reference indicator can be generated from the content of the chapter item. In the example above, if there are multiple works edited by Editor in 2010, disambiguation will not be performed, so that is for the author to sort out. It's a limitation of this simple mechanism. If all chapter entries are to incorporate the book entry by reference as shown in the example, consolidation is of course not needed at all; it would change nothing over the status quo. If a work from which only one chapter is cited is to be formatted differently, the tracking test can handle that.

It would be an overstatement to say that the mechanism is useful only for footnote-only styles. The purpose of digging into it has been to support @georgd's work on a set of Austrian legal styles that require consolidation in the bibliography, and it seems likely to address needs there.

fbennett commented 4 years ago

How do consolidated items currently appear in the bibliography? Just as full references?

The consolidation mechanism provides conditions to identify the context (first-occurring or subsequent reference to container, single or multiple items cited to the container). What gets rendered in a given context is arbitrary CSL code.
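As a sketch of the two pieces of context mentioned here, the function below annotates sorted bibliography entries with whether their container has already occurred and how many cited items share it. The property names echo container-item-count and container-subsequent-in-bibliography from this thread, but the code, the entry shape, and the containerId field are hypothetical.

```javascript
// Hypothetical sketch: annotate bibliography entries (already in sort order)
// with container context that a style's conditions could test.
function annotateBibliography(entries) {
  const counts = new Map();
  for (const e of entries) {
    if (e.containerId) counts.set(e.containerId, (counts.get(e.containerId) || 0) + 1);
  }
  const seen = new Set();
  return entries.map((e) => {
    const containerSubsequent = Boolean(e.containerId) && seen.has(e.containerId);
    if (e.containerId) seen.add(e.containerId);
    return {
      id: e.id,
      // Number of cited items sharing this entry's container (0 if none).
      containerItemCount: e.containerId ? counts.get(e.containerId) : 0,
      // True for every entry after the first one in the same container.
      containerSubsequent,
    };
  });
}

const annotated = annotateBibliography([
  { id: "ch1", containerId: "bookA" },
  { id: "ch2", containerId: "bookA" },
  { id: "ch3", containerId: "bookB" },
]);
```

What gets rendered for each combination remains up to the style; the sketch only shows where the condition values could come from.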

fbennett commented 4 years ago

Second, some of the position/note-number management problems you are solving here seem reminiscent of Chemistry-style compound citations (e.g., managing the citation group number and the citation item number). Could some of the new logic here potentially be reused (obviously the attributes would be distinct entities)?

In this consolidation mechanism, grouping depends on item content, which is (relatively) static and easy to track. In the chemistry style, grouping depends on mutual first-reference cite position, which is subject to change, with knock-on effects across all derived references when the document is edited. It's a whole different kettle of fish.

fbennett commented 4 years ago

Okay, the next round is ready for review. There have been a bunch of little changes; here are the elements.

Jurism beta installer (Mac only)

bwiernik commented 4 years ago

This is looking pretty good. I have a few thoughts, but it's late here, so I'll just make two minor comments.

1) To match CSL naming conventions, can the item type be legal_commentary? Underscore is used as a substitute for a space; hyphen indicates a hierarchical relationship (e.g., article subtype journal; the place attached to a publisher).

2) Thinking about if there is a safe way to render the container for consolidated items without requiring manually adding it as an uncited item. Is your concern constructing a full separate item data from the container data?

fbennett commented 4 years ago

This is looking pretty good. I have a few thoughts, but it's late here, so I'll just make two minor comments.

To match CSL naming conventions, can the item type be legal_commentary? Underscore is used as a substitute for a space; hyphen indicates a hierarchical relationship (e.g., article subtype journal; the place attached to a publisher).

Will do, before Jurism release.

Thinking about if there is a safe way to render the container for consolidated items without requiring manually adding it as an uncited item. Is your concern constructing a full separate item data from the container data?

Possibly it could be done, but I won't be attempting it. There are a couple of difficulties that suggest it wouldn't be worth the candle.

It would be nice to automate this, but I don't think it would be worth the effort.

denismaier commented 4 years ago

Thinking about if there is a safe way to render the container for consolidated items without requiring manually adding it as an uncited item. Is your concern constructing a full separate item data from the container data?

Possibly it could be done, but I won't be attempting it. There are a couple of difficulties that suggest it wouldn't be worth the candle.

With some sort of cross-referencing between items we could establish links between the parent and child items, and then automatically add the parent if there are enough child items in the bibliography. That's how it works with biblatex. That is probably as automatic as it can get without the downsides brought up by @fbennett.

fbennett commented 4 years ago

If the calling application is aware of the parent/child relationship, it can assure that the parent is included in the input. The processor wouldn't need to be involved, it would just run the CSL.
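A minimal sketch of what that could look like on the application side, assuming each child item carries a hypothetical parentId link and that the parent is added once a threshold number of its children are cited. None of these names come from Zotero, Jurism, or citeproc-js.

```javascript
// Hypothetical sketch: before handing item IDs to the processor, the calling
// application appends any parent with at least minChildren cited children.
function withParents(citedIds, itemsById, minChildren = 2) {
  const childCounts = new Map();
  for (const id of citedIds) {
    const parentId = itemsById[id] && itemsById[id].parentId;
    if (parentId) childCounts.set(parentId, (childCounts.get(parentId) || 0) + 1);
  }
  const result = [...citedIds];
  for (const [parentId, n] of childCounts) {
    if (n >= minChildren && !result.includes(parentId)) result.push(parentId);
  }
  return result;
}

const library = {
  ch1: { parentId: "book" },
  ch2: { parentId: "book" },
  ch3: { parentId: "other" },
  book: {},
  other: {},
};
const inputIds = withParents(["ch1", "ch2", "ch3"], library);
```

Here "book" is added because two of its chapters are cited, while "other" is not, since only one of its chapters appears.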

fbennett commented 4 years ago

Thinking further about parent/child links though, that would open up fresh issues for a calling application like Zotero/Jurism that would complicate things. Users would have the option of adding parent items manually, or setting links within the application DB. If links are not comprehensive, the behaviour will vary by document, which would need explaining when users encounter the inconsistency. Another issue would be that some items would have multiple parents (such as an addendum to a parent treaty, published in a reporter), and one could be handled automatically, the other not, and that would need explaining. My sense of it is that in a batch-processing system like BibLaTeX, the output needs to be picture-perfect on first run, which justifies placing a greater curation burden on users. In contrast, in an interactive environment like the Zotero word processor plugins, particularly working off a system designed to absorb materials from eclectic sources, the ease of manual adjustments tends to make that the path of least resistance. In Zotero, at least, it would be one for @dstillman to decide, but in the end I think you would want to choose one approach or the other, for the sake of simplicity at the user end.

bwiernik commented 4 years ago

What if the items had a container-id or container-citation-key variable? That could be compared against existing items in the bibliography?

fbennett commented 4 years ago

How would that information be maintained on individual items? (It would need to be the Zotero itemID, if the processor were to call retrieveItem() on it.)

denismaier commented 4 years ago

CSL 1.0.2 adds a citation-key variable that could be used for this. But I don't know whether citeproc-js can use it for this operation.

bwiernik commented 4 years ago

Looking at various options, I don't see one that isn't a huge pain in the ass. So, let's stick with the "add an uncited item" approach.

georgd commented 4 years ago

Implemented and testing in https://github.com/georgd/jm-styles/commit/852a8d4b49c611e46a007f2dc6893e70bccead15 and https://github.com/georgd/jm-styles/commit/14c7f853af07b16aa4e15e289a3155a0f84cb2db:

This works well in the citeproc-test-runner.

However, the Jurism beta linked above doesn’t play nicely. jm-leg-cit-ohne-verzeichnisse.csl is reported as invalid, and in the other style container-consolidation doesn’t work. Should I clean up any previous installation of Jurism?

fbennett commented 4 years ago

Good morning! I'm about to create a small last-hurdle headache for you, pushing some tidy-up changes to the attribute names. The docs have been updated, and I will amend the summary and links above shortly. When the changes are all in place, I'll push an update to the Mac beta, and we'll see if it behaves better.

fbennett commented 4 years ago

The beta has been refreshed; Help -> Check for updates should give you the latest. After adjusting things to reflect the new attribute names (shown in the summary above), how are your results?

georgd commented 4 years ago

Ok, found out why it didn’t work. The link above is referencing the release version. I bravely exchanged 'beta' for 'release' in the URL and now I do have the newest version :-). I’ll report about the tests later.

fbennett commented 4 years ago

Oh, sorry! That was careless, I'll fix the link for posterity.

On Monday, September 28, 2020, Georg Mayr-Duffner notifications@github.com wrote:

Ok, found out why it didn’t work. The link above is referencing the release version. I bravely exchanged 'beta' for 'release' in the URL and now I do have the newest version :-). I’ll report about the tests later.


fbennett commented 4 years ago

Closing, as the features here are documented in ReadTheDocs, and seem to be complete and working in the beta.