zotero / citeproc-rs

CSL processor in Rust.
https://cormacrelf.github.io/citeproc-wasm-demo/
Other
75 stars 11 forks source link

"RuntimeError: unreachable executed" when calling batchedUpdates #31

Open adomasven opened 4 years ago

adomasven commented 4 years ago
Zotero debug log ``` (5)(+0000004): CiteprocRs: new Driver (5)(+0000001): CiteprocRs: new Driver (5)(+0000008): CiteprocRs: setClusterOrder [] (3)(+0000000): CiteprocRs: updateUncitedItems not available in citeprocRs (5)(+0000484): CiteprocRs: insertReference {"id":"1301","type":"chapter","abstract":"When John A. Hannah announced his retirement as president of Michigan State University to become Director of the U.S. Agency for International Development, it marked the end of an era. For twenty-eight years, he had done all the things which Clark Kerr says a university president must do; he was a “friend of the students, a colleague to the faculty, a good fellow with the alumni, a sound administrator with the trustees, a good speaker with the public, an astute bargainer with the foundations and the federal agencies, a politician with the state legislature, a friend of industry, labor, and","container-title":"The Test","ISBN":"978-0-87013-648-1","page":"17-32","publisher":"Michigan State University Press","source":"JSTOR","title":"THE STAGE IS SET","URL":"http://www.jstor.org/stable/10.14321/j.ctt130hjnp.5","editor":[{"family":"ADAMS","given":"WALTER"}],"accessed":{"date-parts":[["2017",10,20]]},"issued":{"date-parts":[["2003"]]}} (5)(+0000001): CiteprocRs: insertCluster {"id":1,"cites":[{"id":"1301"}]} (5)(+0000000): CiteprocRs: setClusterOrder [{"id":1}] (5)(+0000000): CiteprocRs: builtCluster 1 (5)(+0000002): CiteprocRs: insertReference {"id":"1301","type":"chapter","abstract":"When John A. Hannah announced his retirement as president of Michigan State University to become Director of the U.S. Agency for International Development, it marked the end of an era. For twenty-eight years, he had done all the things which Clark Kerr says a university president must do; he was a “friend of the students, a colleague to the faculty, a good fellow with the alumni, a sound administrator with the trustees, a good speaker with the public, an astute bargainer with the foundations and the federal agencies, a politician with the state legislature, a friend of industry, labor, and","container-title":"The Test","ISBN":"978-0-87013-648-1","page":"17-32","publisher":"Michigan State University Press","source":"JSTOR","title":"THE STAGE IS SET","URL":"http://www.jstor.org/stable/10.14321/j.ctt130hjnp.5","editor":[{"family":"ADAMS","given":"WALTER"}],"accessed":{"date-parts":[["2017",10,20]]},"issued":{"date-parts":[["2003"]]}} (5)(+0000001): CiteprocRs: insertCluster {"id":2,"cites":[{"id":"1301"}]} (5)(+0000000): CiteprocRs: setClusterOrder [{"id":2}] (3)(+0000000): CiteprocRs: batchedUpdates zotero(3)(+0000003): RuntimeError: unreachable executed __rust_start_panic@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:2838407:1 rust_panic@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:2829322:1 std::panicking::rust_panic_with_hook::h5e7c2dc110ae79d4@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:2631309:1 std::panicking::continue_panic_fmt::hb5b3e4b5160fe2ab@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:2784297:1 std::panicking::begin_panic_fmt::h62d8474306d91923@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:2806028:1 >::execute::h4d094d2a5a421962@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:2222173:1 salsa::derived::slot::Slot::read_upgrade::hcacf00312d2ed625@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:788038:1 as salsa::dependency::DatabaseSlot>::maybe_changed_since::h88f4ab6124f33143@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:1357987:1 salsa::derived::slot::Slot::read_upgrade::he2ab721d44835f2f@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:743872:1 as salsa::dependency::DatabaseSlot>::maybe_changed_since::h065b302de8820459@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:1217350:1 salsa::derived::slot::Slot::read_upgrade::h600f853d8bc7c867@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:740531:1 salsa::derived::slot::Slot::read::hc6cc882bbd08e744@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:1828830:1 as salsa::plumbing::QueryStorageOps>::try_fetch::h57fccbdeddc4958e@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:1339826:1 citeproc::db::update::UpdateSummary::summarize::h78c8d69619de9e2a@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:1553687:1 citeproc::db::Processor::batched_updates::h876f2355ffba4f02@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:709681:1 citeproc_wasm::Driver::batched_updates::h121e0b0bcb80bfdd@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:1991695:1 driver_batchedUpdates@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:2811953:1 batchedUpdates@resource://zotero/loader.jsm -> resource://zotero/citeproc-rs.js:317:21 getBatchedUpdates@chrome://zotero/content/xpcom/citeproc-rs-to-citeproc-js.js:130:27 Zotero.Integration.Session.prototype._updateCitationsCiteprocRs@chrome://zotero/content/xpcom/integration.js:1900:25 Zotero.Integration.Session.prototype._updateCitations@chrome://zotero/content/xpcom/integration.js:1822:12 Zotero.Integration.Fields.prototype.updateDocument<@chrome://zotero/content/xpcom/integration.js:1042:9 tryCatcher@resource://zotero/loader.jsm -> resource://zotero/bluebird/util.js:16:16 module.exports/PromiseSpawn.prototype._promiseFulfilled@resource://zotero/loader.jsm -> resource://zotero/bluebird/generators.js:97:18 From previous event: captureStackTrace@resource://zotero/loader.jsm -> resource://zotero/bluebird/debuggability.js:829:23 CapturedTrace@resource://zotero/loader.jsm -> resource://zotero/bluebird/debuggability.js:721:5 longStackTracesCaptureStackTrace@resource://zotero/loader.jsm -> resource://zotero/bluebird/debuggability.js:400:19 module.exports/Promise.prototype._then@resource://zotero/loader.jsm -> resource://zotero/bluebird/promise.js:232:9 module.exports/Promise.prototype._passThrough@resource://zotero/loader.jsm -> resource://zotero/bluebird/finally.js:94:12 module.exports/Promise.prototype.finally@resource://zotero/loader.jsm -> resource://zotero/bluebird/finally.js:103:12 PromiseSpawn@resource://zotero/loader.jsm -> resource://zotero/bluebird/generators.js:36:25 module.exports/Promise.coroutine/<@resource://zotero/loader.jsm -> resource://zotero/bluebird/generators.js:197:21 Zotero.Integration.Interface.prototype.addEditCitation@chrome://zotero/content/xpcom/integration.js:639:12 ```

Attached above is the debug log for the full set of operations before the error is reached.

cormacrelf commented 4 years ago

Ok, unreachable in this case isn’t an unreachable!() call, it’s just that this build executes WASM’s unreachable instruction when it panics, so you don’t get to see the panic message. An even better report would be attained with --features console.

However, in this case, it is almost certainly this line:

panic!("called cite_position on unknown cite id, {:?}", key);

In this case, UpdateSummary::summarize() is requesting a built cluster for a cluster that isn’t in the document. Why? Because you called builtCluster manually, without calling drain() afterwards, so the processor considers this to be one of the “touched” clusters that needs to be reported as a change; it got added to the update queue. Filtering out any clusters in the queue that shouldn’t be included would fix this, but you should still make sure you call drain() to get optimally small updates. PR incoming to fix and to document better.

Perhaps a better API for that use case would be only “give me every cluster and then drain the update queue, consider this point to be client and processor in sync”, although it would be a bit limiting.

cormacrelf commented 4 years ago

Are you calling builtCluster for the same reason as the demo does? To initialise things? It doesn’t seem like it — you call it after manipulating the document, which is a use case better handled by batchedUpdates as it handles draining the queue itself. Maybe if I could see your code I could help figure out the best way to use the API.

adomasven commented 4 years ago

An even better report would be attained with --features console.

Where would I pass this flag?

Because you called builtCluster manually, without calling drain() afterwards, so the processor considers this to be one of the “touched” clusters that needs to be reported as a change

Why is that? If anything, I thought builtCluster() would remove something from the later-to-be-generated update summary, since we already got to see it in updated form.

but you should still make sure you call drain() to get optimally small updates

Based on the documentation I assumed you only call drain if you did a bunch of updates and don't care about the update summary, so you call drain() to.. drain it. I don't really see the use for something like that either, it's an odd method.

Perhaps a better API for that use case would be only “give me every cluster and then drain the update queue, consider this point to be client and processor in sync”, although it would be a bit limiting.

This was my assumption about what batchedUpdates() does. Do you still have to manually call drain()?

Are you calling builtCluster for the same reason as the demo does? To initialise things?

I cannot really get the demo to work easily (or the wasm bindings, to be honest, not without the wasm-pack output manual edits, but that can be fixed later, since god know where upstream the problems lie (and/or whether the xpcom environment is just way too weird), and there is a bit of an overwhelming amount of stuff in it, which makes it hard to see what the code flow is, or figure out what data gets fed into the Driver just by reading the code.

I call builtCluster() because I am simulating a call to previewCitationCluster() for citeproc, since we need to generate a text preview of an inserted citation in Zotero for the user interface before a final insertion into the doc.

cormacrelf commented 4 years ago

As a flag to wasm-pack after a -- separator, off the top of my head.

Yeah look, builtCluster is not a great API. I will make it more intuitive. But yes, drain means “I’m in sync, all the updates you have are already known to me” so this would be a place to call it. I will try to remove the drain API as you shouldn’t have to think about it at all.

(The reason is that builtCluster is a very thin wrapper over built_cluster. Actually running built_cluster is “recomputing” a cluster, so if it’s called and has to recompute itself and not merely look up a previously computed value, then an update is put in the queue. In your log, it did “recompute” cluster id 1. Your intuition is how it should actually behave.)

For previewCitationCluster, is it a requirement that you get a position-correct preview? I can probably do better than a simulation from inside the processor, because you can just pretend it’s the last cluster wherever it may be, figure out what positions the cites would have, build the cluster from them, and forget about the whole reordering before/after dance.

For the demo, all of the cluster manipulation is in one file, and almost just the one Document class in Document.tsx, so you can just ignore everything else. Everything else is UI, except useDocument.tsx, which is references and locales and has to be weird because React is, but I wrote a straightforward version at the end to clarify what it’s doing. I would have thought Document would be easy to understand, as it does very little. I guess it also reference-counts the references, but I figured that would be useful too. Probably a bit complicated that everything change happens inside produce(), but that’s how to ensure you never forget to call batchedUpdates() and re-render, and I’d do it again in any other codebase.

I don’t know why you’d be having trouble getting the demo app to work — it starts itself, compiles the processor and opens itself in a browser with a single invocation of yarn && yarn start, pretty hard to mess that up? Were you trying to compile it with raw tsc calls or something?

I’m a little more concerned about the manual edits you mention. Nothing in JavaScript is ever simple, but it really is pretty standard JS when you build it like we discussed in that other issue. Are you working on a branch that I can look at?

adomasven commented 4 years ago

For previewCitationCluster, is it a requirement that you get a position-correct preview?

Yes. Although implementing this as a client of the library isn't too bad.

I’m a little more concerned about the manual edits you mention. Nothing in JavaScript is ever simple, but it really is pretty standard JS when you build it like we discussed in that other issue.

I'm just replacing the final line of the file generated with wasm-pack --target no-modules from

self.wasm_bindgen = Object.assign(init, __exports);

to

module.exports = init;
Object.assign(module.exports, __exports);

since self is not defined in the environment, so nothing major, but it's still a bit of a pain. I was kinda going into setting this up with a mindset of "let's throw this together quickly to test in Zotero" and haven't really set up a proper Zotero packaging script, since I keep on running into issues and wanted to get that sorted before looking for "proper" solutions of papercuts.

However, this issue also kinda exacerbates the annoyance with the js-demo, since it requires a different wasm-pack build and undoes my manual changes.

As for the js-demo, it's best feature: batteries included is also its worst feature, since the bulk of the code is setting up UI and React stuff. But also it would help a lot if it used its own instance of bindings, otherwise you need to clone out a certain not-entirely-obvious part of the project somewhere else if your build of wasm-bindings is different from what the ones that js demo uses and you want to have it running alongside whatever you are doing in the project. It helps most when you have it running and are able to inspect the variables during runtime (since following control flow by reading code is not easy, nor necessary, if you just care about a specific API call).

For the demo, all of the cluster manipulation is in one file, and almost just the one Document class in Document.tsx, so you can just ignore everything else. Everything else is UI, except useDocument.tsx, which is references and locales and has to be weird because React is, but I wrote a straightforward version at the end to clarify what it’s doing.

In #26 I was getting an error when feeding references to the processor which was producing errors. Looking at Document.ts alone just wasn't enough and useDocument.ts was too dense and too much effort to understand compared to getting the demo to work and using a debugger. This would be easier had I had a chance to use hooks in React, but I have not.

cormacrelf commented 4 years ago

https://developer.mozilla.org/en-US/docs/Web/API/Window/self ... but I appreciate this is probably an XPCOM thing. You could always just put if (!window.self) window.self = window; in your codebase somewhere, taking care not to overwrite if already present as the property is meant to be read-only.

With the separate build annoyance, either of us can improve that by moving it from pkg to somewhere else since the destination directory is configurable. wasm-pack -h for the particular flag, I think it’s -o.

As for following control flow, personally, I rarely if ever use breakpoints at runtime. I just have a good editor and use Go To Definition liberally (absent that, a nice shortcut for searching the codebase for the word under the cursor).