zotero / zotero-ios

Zotero for iOS
https://apps.apple.com/us/app/zotero/id1513554812
Other
226 stars 28 forks source link

citeproc-rs integration #262

Open cormacrelf opened 3 years ago

cormacrelf commented 3 years ago

Hi guys!

I have built the beginnings of some Swift bindings for citeproc-rs over here: https://github.com/cormacrelf/CiteprocRsKit. It should be quite easy to build with Carthage. The new multi-platform/multi-architecture .xcframework format is pretty great and Carthage builds them now, it's also next to no configuration and very little overhead/things to learn, so I strongly recommend doing that. I made a toy test app and it runs on my phone, so yeah, it works.

Re building it: The build will take a long time as it builds the entire thing twice for macOS, twice for iphonesimulator, and once for iOS. Fortunately, it only builds when you run carthage commands, not every time you clean your own Xcode project. If you keep the Carthage/Checkout directory on your computer it will be fast to rebuild when I push updates to the Rust. if you set up e.g. GitHub Actions for Zotero iOS you will need to get some caching of the Carthage/Checkouts folder and it will use a few GB at least. I think I will get some pre-built .xcframeworks going on my own CI so this is unnecessary. (You can download & add these to Xcode basically. Swift Package Manager can consume .xcframeworks as a URL and a checksum. There is fast-moving work on a package registry that will end up as a GitHub Packages registry for swift incl binaries.)

Re the FFI wrapping and what's in there: Most of what's in there now is a build script to make Xcode and Rust/Cargo play nice, and some Swift type wrappers around the generated C headers, Swift strings <--> char * <--> Rust Strings, etc. The actual C FFI lives in the ffi branch for the time being i.e. https://github.com/zotero/citeproc-rs/tree/ffi. I haven't done the error handling work (tricky) so at the moment it is panicky instead, but that's next. To keep things simple I am sticking with JSON for converting between all the big types that cross the FFI boundary rather than building an explosion of e.g. citeproc_rs_reference_add_date_field(...) functions, so making an API for you is quite simple.

The remaining questions are pretty much just requirements from your end. I am aware of two general reqs:

I have the first one there already, which you can see as driver.one_ref_citation(reference: Any). The test suite exercises all the API it has so far. So have a look at that and see what you think/tell me what you need and I'll go from there! Cheers.

dstillman commented 3 years ago

Great!

@michalrentka can comment on the Swift details.

On requirements, I think it'll just be the equivalent of right-click → "Create Bibliography from Items" in the desktop client: one or more items as a citation or bibliography with a given style and locale as text and HTML. (For note styles, citation means note.) If multiple items are selected for a citation, it's a multi-item citation. We were originally thinking it would be equivalent to ZoteroBib, which only has single-item citations and full-list bibliographies, but this will be used by more advanced users, and other integration options on iOS are basically non-existent, so I think we'll want to support multi-item citations.

michalrentka commented 3 years ago

Thanks for all the info. The only comment I have right now is, that we're using SwiftPM for dependency management. But if you can't support SwiftPM currently I can add Carthage to the project as well.

cormacrelf commented 3 years ago

Being new to Swift it took me a while to figure out that I can just upload prebuilt .xcframeworks to GH releases, with a simple auxiliary GitHub repo that only has a Package.swift pointing to the URL + checksum, and then you can just use SwiftPM with that auxiliary repo. Very simple. I have a spare Mac that can do it but manual upload is just fine. I'll get that going tomorrow.

(Carthage is still an option if you want to build it yourself. It's how I will build the .xcframeworks for upload, so they'll be identical.)

cormacrelf commented 3 years ago
dependencies: [
    .package(name: "CiteprocRsKit", url: "https://github.com/cormacrelf/CiteprocRsKit-Binary", from: "0.1.0-alpha.1"),
]

Or in the case of this repo's xcodeproj, add via the UI with that URL.

Don't go using that in a real live beta yet but it's a start.

stakats commented 3 years ago

Should we start experimenting with this integration, or would it be better to wait until most of the open issues have been closed (for example, zotero/citeproc-rs#77 and higher)?

michalrentka commented 3 years ago

@cormacrelf any updates here? If you have advanced in this could you push a new release to the SPM repo?

cormacrelf commented 3 years ago

Hey guys, I have some new work, which I will try to release in a couple of days or so --

image

That's a quick demo of how the mechanism works -- "setting field to 0" corresponds to "deallocate all the citeproc structures and set it to None so it can never be used again" in the real program. This is also known as "poisoning".

Edit: and yes, all the recent fixes will be available in that build when I do.

michalrentka commented 3 years ago

@cormacrelf that's great news. I was away for a bit so I just got to read this. Do you have any time estimate when this will be available?

cormacrelf commented 3 years ago

@michalrentka v0.2.0 is in the releases now (swift, also available in the binary repo). There is a lot more new stuff than what I described there. Sorry for holding this up.

Sean tells me there's a general performance concern for the citation formatting -- I have a couple of notes in that respect.

Performance-wise, this is the same code that runs all thousand-odd tests in the test suite in 0.3 seconds (multicore, release mode) on my M1 laptop. Without the WebAssembly engines of JavaScript-world getting in the way, it won't be far off that level of performance. The Swift wrapper and data transfer is about as minimal as can be.

That being said, I would recommend, when you have the time, making it run the main formatting operations on a background thread. Big inputs of 1000 references can still take a couple of seconds and formatting these (particularly with disambiguation enabled to any degree) by far dominates the time taken manipulating the cites/clusters/references. I You get about 300 items in a single APA bibliography (big!) before you get perceptible (300ms) lag, so a pretty rare problem. The main gain is if someone accidentally formats their entire 2000-item Zotero library, you would be able to show them a spinner with a back button instead of freezing.

Operating the CRDriver from a dispatch queue would currently require some kind of mutex (in swift: a DispatchSemaphore). But it would also be surprisingly simple for me, almost trivial, to instead split the little mutations and big queries into two classes, whereby the query half (a "snapshot") can be shared between threads even while the mutations continue, simply cancelling any in-flight query if you do any mutations. I think it would be easier to implement, harder to screw up from Swift, and more performant than synchronisation over the entire thing. If you want this to run on a dispatch queue, then I can do that for you, just let me know.

michalrentka commented 3 years ago

@cormacrelf I wanted to test the package in the project. I'm getting SwiftPM error when trying to add the dependency: "Package resolution failed. CiteprocRsKit could not be resolved."

Screen Shot 2021-08-16 at 12 21 38

Any idea what's up with that?

cormacrelf commented 3 years ago

Use the cormacrelf/CiteprocRsKit-Binary repo, not plain CiteprocRsKit. It has a Package.swift file that refers to an uploaded GitHub release asset. It's basically a URL and a checksum, dressed up as an entire repo that Xcode / SwiftPM can use the git tags on to resolve versions. Not super elegant but this is the only way to do it, Swift expects a Package.swift at the root as I recall.

Unfortunately yesterday I found a separate issue with SwiftPM that I'm trying to fix, see https://github.com/cormacrelf/CiteprocRsKit/issues/3 for details

cormacrelf commented 3 years ago

@michalrentka The previous message no longer applies, use the CiteprocRsKit repo directly. I managed to get it working via SwiftPM exclusively, using a precompiled binary for only the Rust parts.

See the release notes at https://github.com/cormacrelf/CiteprocRsKit/releases/tag/v0.4.0

Edit: I'm also now building browsable documentation which you can view here (the link is also on the CiteprocRsKit repo info)