srnsw / State-Records-Search

Java version of api.records.nsw.gov.au

OAI-PMH resumption tokens #34

richardlehane closed this issue 10 years ago

richardlehane commented 11 years ago

The resumption tokens used for OAI-PMH are returning the wrong range of records. Each page repeats records already returned by earlier pages, and the duplication gets progressively worse the further you page through the set.

E.g. compare these two consecutive pages:

http://search.records.nsw.gov.au/oai?verb=ListRecords&set=agencies&resumptionToken=rif:agencies:::2
http://search.records.nsw.gov.au/oai?verb=ListRecords&set=agencies&resumptionToken=rif:agencies:::3

You'll see that there is overlap in the records returned. This causes a lot of duplication in harvesting. E.g. when ANDS went to harvest agencies they retrieved 33948 objects for only 3586 unique agencies.
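For illustration, here is a minimal sketch of the behaviour a harvester expects: each token should map to a disjoint slice of the record list, so that the total harvested equals the number of unique records. This is not the project's actual code. The class name `ResumptionPaging`, the page size of 100, the zero-based page numbering, and the reading of the token's trailing number as a page index are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ResumptionPaging {
    // Hypothetical batch size; the real service's page size is not stated in this issue.
    public static final int PAGE_SIZE = 100;

    /** Extracts the trailing number from a token such as "rif:agencies:::2". */
    public static int pageOf(String token) {
        int idx = token.lastIndexOf(':');
        return Integer.parseInt(token.substring(idx + 1));
    }

    /** Index of the first record on the page (assumes pages are numbered from 0). */
    public static int offsetOf(String token) {
        return pageOf(token) * PAGE_SIZE;
    }

    /** Returns the slice [offset, offset + PAGE_SIZE), clamped to the list size. */
    public static List<String> page(List<String> ids, String token) {
        int from = Math.min(offsetOf(token), ids.size());
        int to = Math.min(from + PAGE_SIZE, ids.size());
        return ids.subList(from, to);
    }

    public static void main(String[] args) {
        // Stand-in data: 3586 made-up identifiers, matching the agency count quoted above.
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 3586; i++) {
            ids.add("agency-" + i);
        }

        // Walk every page the way a harvester would and count what comes back.
        Set<String> seen = new HashSet<>();
        int harvested = 0;
        for (int p = 0; ; p++) {
            List<String> batch = page(ids, "rif:agencies:::" + p);
            if (batch.isEmpty()) {
                break;
            }
            harvested += batch.size();
            seen.addAll(batch);
        }
        System.out.println(harvested + " harvested, " + seen.size() + " unique");
    }
}
```

With disjoint slices the two counts printed by `main` are equal (3586 each). A harvest like the one described above, where the first number is roughly ten times the second, indicates overlapping ranges.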

wisanup commented 11 years ago

fixed