retorquere / zotero-better-bibtex

Make Zotero effective for us LaTeX holdouts
https://retorque.re/zotero-better-bibtex/
MIT License

Enable ODF-scan format in CAYW #312

adam3smith closed 9 years ago

adam3smith commented 9 years ago

I've started testing the CAYW feature and it's working very nicely for me together with AutoKey on Linux. I'll blog on that when I get a chance.

It would be wonderful if you could add ODF-scan as a third format option, in addition to mmd and pandoc, so that users can use this with our tool, e.g. on Google Docs or with Scrivener, if they want live Zotero citations.

The format would be: {prefix|human readable citation|locator_label locator|suffix|unique identifier}, where human readable citation is a string that will get overwritten by the citation style but should allow the user to identify the citation. We just use author date, but if need be citekey would do.

The locator is preceded by a locator_label like p. and a space -- full list under "Locators" here

The unique identifier is zu:libraryID:itemID for items from personal libraries and zg:groupID:itemID for items from groups. Not having looked at your code, I'd hope that you could just recycle our (translator) code to generate the unique ID from an item's URI: https://github.com/Zotero-ODF-Scan/zotero-odf-scan/blob/master/plugin/resource/translators/Scannable%20Cite.js#L58
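A minimal sketch of the marker layout described above, assuming the five fields are simply joined with `|` inside braces and that empty fields are left empty. The function name `buildScannableCite` and the sample values are hypothetical and are not taken from the ODF-scan code.

```javascript
// Assemble one scannable cite marker from its five fields.
// Field order follows the format given in this issue:
// {prefix|human readable citation|locator_label locator|suffix|unique identifier}
function buildScannableCite({ prefix = '', label = '', locator = '', suffix = '', id = '' }) {
  return '{' + [prefix, label, locator, suffix, id].join('|') + '}';
}

// Sample values are made up for illustration only.
const marker = buildScannableCite({
  prefix: 'see',
  label: 'Smith 1776',
  locator: 'p. 12',
  suffix: 'and others',
  id: 'zu:123:12314',
});
console.log(marker);
```

Keeping the empty fields in place means the number of `|` separators stays constant, which is what lets a scanner split the marker by position.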

What do you think? (cc @fbennett )

retorquere commented 9 years ago

Should be fairly easy, if you can clarify a few things for me.

For google docs it should in principle be possible to call the picker URL directly and do real integration, including paste, but my GScript-FU is weak, so I don't know for sure.

adam3smith commented 9 years ago

Cool, let me answer that in order 2, 1, 3:

2) We use the library ID because it's part of item.uri, which is what the Zotero LO (and Word) add-on uses to link citations to Zotero items. On scan, we then generate a Zotero Reference Mark with the item.uri, and when the user then loads the doc in LO and picks a citation style, Zotero calls up that item. Does that make sense?

1) After the first sync, your library gets assigned a global ID, which becomes part of the item's URI (beforehand that's just "local" in the URI). It's the same library ID you'd use for API requests. I'd think you should be able to just extract it from item.uri, which you should have access to from any Zotero item object as well as from whatever data the picker returns to you (since the picker would include it when used in LO).

3) What we do is to substitute the short title/title if an item has no creator and print "no date" where an item has no date. That'd be nice to do and will produce something that makes sense in all cases I can think of. @fbennett also has some additional logic for legal citations in there, which I'm sure he'd be delighted to have included -- I figure the easiest is for you to just read/copy that from the ScannableCite translator code. But as I said, that part of the cite is just for human-readability before the scan, so if you don't want to bother with the details, that'd be fine, too.

retorquere commented 9 years ago

I have a test version up at http://tempsend.com/FFB0203852, code is at https://github.com/ZotPlus/zotero-better-bibtex/tree/scannable-cite, format is scannable-cite. I'll have a look at the human readable label, and I'll seriously have to look into that library ID thing -- I currently have no idea where it lives in the Zotero DB, which is where I have to take it from.

retorquere commented 9 years ago

The actual code for new formats is in fact fairly simple -- just some 10 lines at https://github.com/ZotPlus/zotero-better-bibtex/blob/scannable-cite/chrome/content/zotero-better-bibtex/cayw.coffee#L133

adam3smith commented 9 years ago

and item.uri doesn't return anything for you?

retorquere commented 9 years ago

that is only set inside the translators.

adam3smith commented 9 years ago

Had a first look. Remaining issues:

  1. Forgot about this: checking "suppress author" should put a minus sign (or technically a hyphen, i.e. -) at the beginning of the human readable citation: {|-Smith 1776|||zu:123:12314}
  2. The itemID is the wrong one, too: we need the 8-character alphanumeric one, the same one used e.g. for reports. Right now this produces the sequential number of the item (like 345).

When I look at Reference Marks in LibreOffice, I get something like this: "citationItems":[{"id":4480,"uris":["http://zotero.org/users/2433/items/2E869MQK"],"uri":["http://zotero.org/users/2433/items/2E869MQK"],"itemData":{"id":4480,"type":"article-newspaper","title":"עושים לנו טובה גדולה","container-title":"The Marker","language":"he","author":[{"family":"מרב מיכאלי","given":""}],"issued":{"date-parts":[["2011",5,16]]}}}],"schema":"https://github.com/citation-style-language/schema/raw/master/csl-citation.json"} RNDO1GFPvHVVK

Are you not getting those uri(s) returned from the picker? Those have the data we'd need. In this case the 2433 (my personal libraryID) and 2E869MQK (the itemID).
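As a rough sketch of pulling both pieces out of such a URI: the function below turns the `users` URI quoted above into a `zu:` identifier. The function name `uriToScannableId` is hypothetical, the `groups` URI shape is an assumption modelled on the `users` one, and unsynced "local" URIs are simply rejected here because this thread does not say how they should be encoded.

```javascript
// Derive the scannable-cite identifier from a Zotero item URI.
// Assumption: group items use http://zotero.org/groups/<groupID>/items/<key>.
function uriToScannableId(uri) {
  const match = /^https?:\/\/zotero\.org\/(users|groups)\/(\d+)\/items\/([A-Z0-9]{8})$/.exec(uri);
  if (!match) return null; // e.g. a "local" URI from a library that has never synced
  const [, kind, libraryID, key] = match;
  return `${kind === 'users' ? 'zu' : 'zg'}:${libraryID}:${key}`;
}

// URI taken from the Reference Mark quoted above.
const id = uriToScannableId('http://zotero.org/users/2433/items/2E869MQK');
console.log(id);
```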

retorquere commented 9 years ago

I think I've found something; could you try http://tempsend.com/7B18CFCAD0 ? I think I now also understand why you use the library ID -- you want these citations to be stable across users whenever possible. That makes me also think you're using the item key rather than the itemID. That assumption has been baked into the new version on tempsend.

retorquere commented 9 years ago

We keep crossing messages :) that confirms the key vs itemID.

retorquere commented 9 years ago

The translators get neutered pseudo-objects, very different from code that lives outside the translators. "Suppress author" is now in http://tempsend.com/89B826663B

retorquere commented 9 years ago

The picker doesn't return very much. What it returns looks like

{"citationItems":[{"id":"5","label": "line", "locator":"1","prefix":"see","suffix":"and others","suppress-author":true}],"properties":{}}

All the rest I pick from the DB using this data.
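To illustrate how little the picker supplies, here is a hedged sketch that turns the picker output quoted above into a scannable cite. The `fakeDb` object stands in for the database lookup and is entirely made up, as are the names `formatPick` and `abbrev`. The locator abbreviation `l.` is a placeholder; the real abbreviations are in the ODF-scan "Locators" list. Falling back to the raw label when no abbreviation is known is also an assumption.

```javascript
// Picker output exactly as quoted in this thread.
const picked = JSON.parse('{"citationItems":[{"id":"5","label": "line", "locator":"1","prefix":"see","suffix":"and others","suppress-author":true}],"properties":{}}');

// Hypothetical stand-in for what would be read from the Zotero DB by item id.
const fakeDb = { '5': { label: 'Smith 1776', uid: 'zu:123:ABCD2345' } };

function formatPick(pick, db, abbrev) {
  const item = db[pick.id];
  // "Suppress author" is signalled by a leading hyphen on the readable label.
  const label = (pick['suppress-author'] ? '-' : '') + item.label;
  // The locator is written as "<locator_label> <locator>", e.g. "p. 12".
  const locator = pick.locator ? `${abbrev[pick.label] || pick.label} ${pick.locator}` : '';
  return '{' + [pick.prefix || '', label, locator, pick.suffix || '', item.uid].join('|') + '}';
}

// Placeholder abbreviation table; check the ODF-scan docs for the real values.
const cites = picked.citationItems.map(p => formatPick(p, fakeDb, { line: 'l.' }));
console.log(cites.join(''));
```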

retorquere commented 9 years ago

More questions:

adam3smith commented 9 years ago

OK, I've given it a test spin, working flawlessly all the way through the process.

For your questions:

retorquere commented 9 years ago

(turns out I do have access to the uri, but it just includes the userID if I read the code right, and that's the number I'm after. But I'd rather have verification than trust my code-reading skills)

retorquere commented 9 years ago

Aha! OK, so if I understand correctly, the human readable part should be label year, where label = shorttitle | title | author, first found, and year is either the parsed year or "no date". Right?

retorquere commented 9 years ago

And just in case I got it right: http://tempsend.com/09D7B31421

adam3smith commented 9 years ago

almost: the order for label should be author | shortTitle | title otherwise exactly what you say.

retorquere commented 9 years ago

I don't think the items in the translator return parsed dates. They return whatever the "date" field holds; Zotero offers a function to parse them. Looking at the scannable cite translator I now see it does year | date | no date, first found.
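The fallback rules discussed so far (author | shortTitle | title for the label, then year | date | "no date") can be sketched as below. The item shape (`creators[].lastName`, `shortTitle`, `title`, `date`) and the function name `humanReadableLabel` are assumptions for illustration, and the four-digit regex is only a stand-in for Zotero's own date parsing. The special handling for legal item types is not covered.

```javascript
// Build the human-readable part of a scannable cite for a non-legal item.
function humanReadableLabel(item) {
  const firstCreator = item.creators && item.creators.length ? item.creators[0].lastName : '';
  // Label: author, else short title, else title (first one found).
  const label = firstCreator || item.shortTitle || item.title || '';
  // Date: parsed year, else the raw date field, else "no date".
  const yearMatch = /\b(\d{4})\b/.exec(item.date || '');
  const when = (yearMatch && yearMatch[1]) || item.date || 'no date';
  return `${label} ${when}`;
}

console.log(humanReadableLabel({ creators: [{ lastName: 'Smith' }], title: 'Wealth of Nations', date: '1776-03-09' }));
console.log(humanReadableLabel({ shortTitle: 'Pamphlet', title: 'An Anonymous Pamphlet' }));
```

Since this part of the marker is overwritten by the citation style after the scan, it only has to be recognisable to the writer, not exact.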

New version that has both these changes at http://tempsend.com/07F03B858A

retorquere commented 9 years ago

I've looked at the scannable cite translator for the legal stuff, but could someone do me a broad walkthrough?

retorquere commented 9 years ago

Ah, I think I understand most of it -- I'll just have a stab at it in the morning.

adam3smith commented 9 years ago

FWIW, for legal item types (the ones listed at the top; some of them only exist in Frank's juris-m (MLZ) fork, same for some of the fields below), use authority, volume reporter pages and precede the date with court. Also, don't do any of the replacing with anon/no title/no date stuff. I'm not entirely clear on the rationale for these; it's somehow related to the bizarre world of legal citations and how Frank solves some of the odder rules there.

retorquere commented 9 years ago

All right. The latest version at https://github.com/ZotPlus/zotero-better-bibtex/blob/scannable-cite/chrome/content/zotero-better-bibtex/cayw.coffee#L148 makes it possible for others to plug in cayw formatters -- if one defines Zotero.BetterBibTeX.CAYW.Formatter.scannableCite as a function that accepts the picks, it can return whatever string it wants. I'll be happy to add scannable cite, but if you'd rather fiddle with code yourself, you could do that in any other plugin should you so desire (and while coffee script happens to be my preference, javascript will do just fine). Just make sure you create the function in init, so BBT has loaded.
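A sketch of what such a plug-in formatter might look like in plain JavaScript. Only the property path `Zotero.BetterBibTeX.CAYW.Formatter.scannableCite` comes from this thread. The `Zotero` object declared here is a stand-in so the snippet runs on its own (a real plugin would use the global that Zotero provides and drop that line), and the fields on each pick (`label`, `uid`, and so on) are assumed to have been resolved elsewhere.

```javascript
// Stand-in for the runtime global; remove this line inside a real plugin.
const Zotero = { BetterBibTeX: { CAYW: { Formatter: {} } } };

// Register the formatter from the plugin's init, so that BBT has loaded first.
function init() {
  Zotero.BetterBibTeX.CAYW.Formatter.scannableCite = function (picks) {
    // The formatter receives the picks and may return any string it likes.
    return picks
      .map(pick => '{' + [pick.prefix || '', pick.label || '', pick.locator || '', pick.suffix || '', pick.uid || ''].join('|') + '}')
      .join('');
  };
}

init();
const out = Zotero.BetterBibTeX.CAYW.Formatter.scannableCite([
  { label: 'Smith 1776', uid: 'zu:123:ABCD2345' }, // made-up pick
]);
console.log(out);
```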

adam3smith commented 9 years ago

This is excellent. I'd be very happy if you could just add it -- as I've said, the human readable part is actually the least important bit and everything else looks perfect the way it does.

Thanks for making this happen so quickly. Let me know when you release the version with this included.

retorquere commented 9 years ago

I'll add it when I get home, and then cut a release. I'm happy to have the scannable-cite picker included, I just wanted to leave the option open that @fbennett wanted to tweak the code to his liking. The hooks stay, so it can be done at any time. Pretty happy to see more use of this.

retorquere commented 9 years ago

1.2.25 will have the scannable cite translator.

retorquere commented 9 years ago

BTW I could just use citeproc to assemble the label like I do for the atom picker: https://github.com/ZotPlus/zotero-better-bibtex/blob/master/chrome/content/zotero-better-bibtex/cayw.coffee#L257