code4lib / ruby-oai

a Ruby library for building OAI-PMH clients and servers
MIT License
62 stars 42 forks source link

Resumption token incomplete list #96

Closed barmintor closed 2 years ago

barmintor commented 2 years ago

This is a rebased and tested version of the changes in #54 The OAI spec for Flow Control:

the response containing the incomplete list that completes the list must include an empty resumptionToken element;

Previously ruby-oai returned a nil resumption token in these scenarios. This PR changes the serialization of ResumptionToken when the last reference is blank (per @gcolson), and exercises the bundle providers for the change to include an empty token element rather than nil.

jrochkind commented 2 years ago

@barmintor I don't really understand what's going on here at all and probably don't have time to try to so can't really give much meaningful review; but per your request for review, I left one comment on something that seemed to me to be leaving things in a worse state than found.

I trust the test changes result in test that would fail but for the fix in this PR, and you believe are testing the right thing(s)?

barmintor commented 2 years ago

These are good questions! What's happening is that when fetching a large set of records in batches (with a resumptionToken) the library would formerly send no token element back with the last set - this makes sense, but is not actually compliant with the spec, which requires an empty token element in that scenario. Some clients may not draw a distinction between those two approaches to signalling the end of a list, but it's good to be conformant.

As you note, the library previously sent a simple list of records back in the last batch (and thus read the .records of the partial response object) - this allowed the suppression of the resumptionToken element output. This has been replaced with the partial response, but to accomplish the empty resumptionToken element, the token's last reference needs to be nilled out. The current API doesn't support the post-initialization assignment of the @token attribute.

There's another PR that might enable a better approach to this, but this is the best incremental change more-or-less within the scope of this PR.

barmintor commented 2 years ago

Actually @jrochkind looking at it again, there's a refactor that would prevent the @token re-assignment (now in this PR).

jrochkind commented 2 years ago

The current API doesn't support the post-initialization assignment of the @token attribute.

Yep, I meant, then how about changing the API so it does, rather than do an instance_variable_set?

To me, instance_variable_set is a last resort workaround when you have to reach into something in third-party code that doesn't supply an API to do so. But if it's in our own gem where we DO control the API, we should fix it, rather than use that workaround. Unless it's like an urgent emergency and we need to fix it but don't have time to fix the API... but this long-standing issue does not seem to be an urgent emergency?

But looks like you found an alternate path.

In general, if I'm really reviewing, i'm going to blokc on instance_variable_set, unless there's a good reason for it other than "we just don't feel like fixing the API to support our own internal use-case"! I hope that makes some sense.

I notice this still has an instance_variableset in here... but at least its' only in test code. ¯\_(ツ)/¯. Still seems regrettable to me, a bomb left waiting for future maintainers. On something with very little maintenance resources like this, I try to "first do no harm", not leave the code worse than it was found.

barmintor commented 2 years ago

@jrochkind it makes good sense! I am prioritizing the evaluation of these PRs with the idea that client code of the contributors should still work, until this project is a little more out of the weeds - so avoiding API changes if possible.

Since I am due to release a related gem next week, I feel like there is a little urgency in fixing known bugs.

I think the test hack you mention could be fixed by investigating the fixture data and adjusting the tested queries, which I am tracking in #98

jrochkind commented 2 years ago

Thank you for your work!

I figure, if the bugs have existed for years, it's probably not urgent to get them fixed by next week! (I am curious if this particular bug was effecting anyone, certain clients or whatever?) Hurrying can make mistakes. We can do a release of this gem anytime, multiple releases even, I don't know it needs to be synchronized with your other gem?

Anyway, in any event, thank you for your work!