diegodlh / zotero-cita

Cita: a Wikidata addon for Zotero with citations metadata support
GNU General Public License v3.0

Bug: Large batch syncing fails #190

Open Proulx-S opened 2 years ago

Proulx-S commented 2 years ago

Describe the bug Batch syncing of a large number of items in a library fails. Syncing a small number of items at a time works fine.

To Reproduce This is the library I am working with: https://www.zotero.org/groups/4774891/proulxs/library. On my local machine, I select all items, right-click, and choose Cita, then Sync citations with Wikidata.

Expected behavior The number of items selected for a batch sync should not matter.

Screenshots [Screen Shot 2022-09-11 at 6 17 56 PM] [Screen Shot 2022-09-11 at 7 56 13 PM]

Environment:

Additional context List of installed add-ons: ZotFile, Folder Import for Zotero

Debug output log: Debug Output.txt

Item or collection causing trouble: Exported Items.rdf.zip

Dominic-DallOsto commented 2 years ago

Thanks for the detailed report!

I had a quick look to try and localise this

That's just a quick guess - I will debug this properly soon. I know with reconciling items via to get a QID there was a limit to the query size above which things would break.
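One common way to stay under a query-size limit like the one mentioned above is to split the selection into fixed-size chunks and send one request per chunk. The following is only a minimal sketch: the `chunk` helper, the made-up QIDs, and the chunk size of 50 are illustrative assumptions, not Cita's actual code or a confirmed limit.

```typescript
// Hypothetical helper (not part of Cita): split a list into chunks of at
// most `size` elements, preserving order.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Example with 120 made-up QIDs and an arbitrary chunk size of 50:
// this yields three chunks holding 50, 50 and 20 IDs.
const qids: string[] = Array.from({ length: 120 }, (_, i) => `Q${i + 1}`);
const batches: string[][] = chunk(qids, 50);
console.log(batches.map((b) => b.length));
```

Each chunk would then go out as its own query, so no single request grows with the size of the selection.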

Dominic-DallOsto commented 2 years ago

Sorry this has taken me so long! At least on my end, I get an out-of-memory error when I try to sync that many items at once. For each citation we need to download the corresponding item's data from Wikidata, which is about 1 MB per item. So for 50K citations it's on the order of 5 GB of memory. Do you see significant memory usage on your end too? I then get an error that the item data downloaded from Wikidata isn't valid. I'm not sure whether this is the real error or a result of the memory issue.

Does this also run super slowly for you? I'm working to speed this up, but I guess to solve the memory issue things would need to be progressively written to disk.
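The idea of writing results out progressively could look roughly like the loop below: fetch one batch, hand it to a sink, and let it go before fetching the next, so peak memory depends on the batch size rather than on the whole selection. This is a sketch under assumptions, not the plugin's implementation. `syncInBatches`, `fetchBatch` and `sink` are hypothetical names, and in practice the sink would write to disk or to the Zotero database.

```typescript
// Hypothetical types: `FetchBatch` downloads the records for a few IDs,
// `Sink` persists them (e.g. to disk) so they need not stay in memory.
type FetchBatch<T> = (ids: string[]) => Promise<T[]>;
type Sink<T> = (records: T[]) => Promise<void>;

// Process `ids` in batches of `batchSize`, holding only one batch of
// downloaded records at a time. Returns the number of records passed on.
async function syncInBatches<T>(
  ids: readonly string[],
  batchSize: number,
  fetchBatch: FetchBatch<T>,
  sink: Sink<T>
): Promise<number> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  let written = 0;
  for (let i = 0; i < ids.length; i += batchSize) {
    // Only this batch's records are referenced inside the loop body.
    const records = await fetchBatch(ids.slice(i, i + batchSize));
    await sink(records);
    written += records.length;
    // `records` is unreachable after this iteration and can be collected.
  }
  return written;
}
```

Batches run one after another here, which trades speed for a bounded footprint. Fetching several batches concurrently would be faster but would multiply the memory held at any one time.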