Closed michi-zuri closed 4 years ago
Regarding partial sync, some facts to consider:
For further reading on this topic I suggest: https://www.zotero.org/support/dev/web_api/v3/syncing#partial-library_syncing
There are two separate issues here, as you say:
I hadn't considered the possibility that when calling last_modified_version
, there could already be a potentially stored version header due to a previous API call. I also didn't think about making a cheaper API call than items()
in order to retrieve headers if no existing headers were available. This will be easy to fix, I think.
I need to figure out whether Library l.m.v is what is always wanted in this case.
OK, so thinking about this a little more, and after reading the documentation:
You can't rely on inspecting self.request.headers
in order to get the library version, because only methods (i.e. items()
, top()
) that return multiple objects return the latest library version (see version numbers
).
So it's necessary to retain an API call in the latest_library_version()
method. However, I've now changed the call to have a limit
of 1, so the method should be much faster (in the 100s of milliseconds, depending on the network).
I can't recreate a situation in which multiple calls to latest_library_version()
return different values when using the user
library type, without making a write API call first.
If using a groups
library type, multiple calls can of course return different values, because of concurrent access by multiple users.
I can't recreate a situation in which multiple calls to
latest_library_version()
return different values when using theuser
library type, without making a write API call first.
Hmm, I can't recreate that situation either now, I must have mixed up something in my mind.
only methods (i.e. items(), top()) that return multiple objects return the latest library version
Yeah, that seems to be correct!
I only just now realised, that the Zotero API only memorizes one newest library version which is called Last-Modified-Version. What I wrote before about partial sync not being implementable without knowing the oldest library version that is identical with the newest for a particular subset: I was wrong. I think this issue can be closed.
Currently, the documentation for
Zotero.last_modified_version()
states:While this description of the function seems to be sometimes true (if called after a request like
Zotero.items()
orZotero.count_items()
), it does not seem true if called after requests to group metadataZotero.groups()
, at least in my testing. The latter returned the last modified version of the group's metadata and not of the library data...After digging into the code I am confused about my empirical results, since the two-step method seems to first make a call to self.items() and only then returns the headers of the last response before calling this method:
Therefore I now think this is a bug, instead of faulty documentation. The documentation describes what the method should be doing if you look at the code, but could it be that the function returns a value before the request has received a response from the Zotero server for the
self.items(**kwargs)
call? Maybe this behaviour has changed since the introduction of this method? I can't tell.For what it matters, I tested this with
Python v3.8.5
andpip freeze
tells me that I am usingPyzotero==1.4.16
andrequests==2.24.0
.Now that I tried a different approach and just inspect the Zotero.request.headers directly without calling a separate method, I could shave off 4.5 seconds from the sync duration, hooray:
If this is a bug that will eventually be fixed, there is a need for an alternative way to access response headers of the last made request. Without direct access to the
Last-Modified-Version
header it is not possible to implement Partial Sync efficiently. However, as it seems possible to directly access the latest response headers inZotero.request.headers
, I am hesitant wether a method would be the right way to expose them, like so:And I'm not sure how to fix the behaviour of last_modified_version(), assuming it is indeed a bug... Hope this helps.