code4lib / ruby-oai

a Ruby library for building OAI-PMH clients and servers
MIT License

client list_records.full errors with resumption token error (sometimes) #107

Open yulgit1 opened 1 month ago

yulgit1 commented 1 month ago

Running a daily script with a full harvest block like:

client.list_records(:metadata_prefix => 'marc21').full.each do |record|

~70% of the time it succeeds and pulls all ~60000 records

~30% of the time it fails partway through the ~60000 records, with this stack trace:

/home/ermadmix/.rvm/gems/ruby-3.3.0/gems/oai-1.2.1/lib/oai/client/response.rb:49:in `initialize': Illegal argument1 'resumptionToken' (OAI::ArgumentException)
    from /home/ermadmix/.rvm/gems/ruby-3.3.0/gems/oai-1.2.1/lib/oai/client.rb:224:in `new'
    from /home/ermadmix/.rvm/gems/ruby-3.3.0/gems/oai-1.2.1/lib/oai/client.rb:224:in `block in do_resumable'
    from /home/ermadmix/.rvm/gems/ruby-3.3.0/gems/oai-1.2.1/lib/oai/client/resumable.rb:15:in `each'
    from harvest_marc_oai.rb:17:in `<main>'

It looks as though the resumption token process works for a while and then stops working.
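One way to narrow this down might be to page through the harvest by hand instead of using `.full`, so that the token in use when a request fails gets logged and the request can be retried. This is only a minimal sketch: `harvest_pages` and its `fetch_page` callable are hypothetical names, not part of ruby-oai, and retrying the same token assumes the server still accepts it, which may not hold if tokens expire.

```ruby
# Hypothetical helper (not part of ruby-oai): pages through a harvest manually
# so the token in use when a request fails can be logged and retried.
#
# fetch_page is any callable that takes a resumption token (nil for the first
# request) and returns [records, next_token].
def harvest_pages(fetch_page, max_retries: 3, log: $stderr)
  token = nil
  count = 0
  loop do
    attempts = 0
    begin
      records, next_token = fetch_page.call(token)
    rescue StandardError => e
      attempts += 1
      log.puts "page failed (token=#{token.inspect}, attempt #{attempts}): #{e.message}"
      raise if attempts > max_retries # give up and surface the original error
      retry
    end
    records.each do |record|
      count += 1
      yield record if block_given?
    end
    # In OAI-PMH an absent or empty resumption token marks the last page.
    break if next_token.nil? || next_token.to_s.strip.empty?
    token = next_token
  end
  count
end

# Possible wiring to ruby-oai (assumes `client` is an OAI::Client and that
# the response exposes `each` and `resumption_token`):
#
#   fetch = lambda do |token|
#     opts = token ? { :resumption_token => token } : { :metadata_prefix => 'marc21' }
#     response = client.list_records(opts)
#     records = []
#     response.each { |r| records << r }
#     [records, response.resumption_token]
#   end
#   harvest_pages(fetch) { |record| puts record.header.identifier }
```

Logging the token on each failure should at least show whether the error always follows the same kind of token (for example an empty one) or happens at arbitrary points in the harvest.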

I also tried a date range:

start_date = (Date.today - 5).strftime("%Y-%m-%d")
end_date = Date.today.strftime("%Y-%m-%d")
client.list_records(:metadata_prefix => 'marc21', :from => start_date, :until => end_date).full.each do |record|

After returning ~11 records, that resulted in:

/home/ermadmix/.rvm/gems/ruby-3.3.0/gems/oai-1.2.1/lib/oai/client/response.rb:49:in `initialize': either 'metadataPrefix' or 'resumptionToken' is required (OAI::ArgumentException)
    from /home/ermadmix/.rvm/gems/ruby-3.3.0/gems/oai-1.2.1/lib/oai/client.rb:224:in `new'
    from /home/ermadmix/.rvm/gems/ruby-3.3.0/gems/oai-1.2.1/lib/oai/client.rb:224:in `block in do_resumable'
    from /home/ermadmix/.rvm/gems/ruby-3.3.0/gems/oai-1.2.1/lib/oai/client/resumable.rb:15:in `each'
    from harvest_marc_oai.rb:20:in `<main>'

Why this error? The call clearly passes a metadata_prefix.

Any ideas appreciated!