Update worldcat api deprecated urls (#1)

LibraryOfCongress / chronam

This software project is no longer being actively developed at the Library of Congress. Consider using the Open-ONI (https://github.com/open-oni) fork of the chronam software. Project mailing list: http://listserv.loc.gov/archives/chronam-users.html.


Update worldcat api deprecated urls (#1) #266

Closed dillonpeterson closed 1 year ago

dillonpeterson commented 1 year ago

Added a subclass of SearchAPIRequest that allows Chronam to pull titles from the updated WorldCat SRU search URL.

The Pull_Titles script was failing because OCLC changed the WorldCat API search URL from http://worldcat.org/webservices/catalog/search/sru to http://worldcat.org/webservices/catalog/search/worldcat/sru.

The WorldCat package (https://github.com/anarchivist/worldcat) that Chronam relies on doesn't have the updated API URL yet. Therefore, I subclassed SearchAPIRequest from the WorldCat package on PyPI, updated the URL, and replaced the WorldCat SRURequest in title_pull with the new subclass.
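The subclass-and-override pattern described above can be sketched roughly as follows. This is not the code from this PR and does not use the WorldCat package: StandInSRURequest, UpdatedSRURequest, BASE_URL, and api_url are hypothetical names chosen for illustration, and only the two URLs come from the description above.

```python
class StandInSRURequest(object):
    """Hypothetical stand-in for a request class from a third-party package."""

    # Old endpoint quoted in the PR description.
    BASE_URL = "http://worldcat.org/webservices/catalog/search/sru"

    def __init__(self, **params):
        # Keep the query parameters; a real class would also hold an API key.
        self.params = params

    def api_url(self):
        # Return the endpoint this request would be sent to.
        return self.BASE_URL


class UpdatedSRURequest(StandInSRURequest):
    """Subclass that changes only the endpoint and inherits everything else."""

    # New endpoint quoted in the PR description.
    BASE_URL = "http://worldcat.org/webservices/catalog/search/worldcat/sru"


old = StandInSRURequest(query="example")
new = UpdatedSRURequest(query="example")
print(old.api_url())
print(new.api_url())
```

Callers then swap the old class for the subclass at the point of use, which leaves the installed third-party package untouched until upstream ships the new URL.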

@acdha please let me know of any requests for modification.

acdha commented 1 year ago

Technically there's one other option here:

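from __future__ import print_function

import urllib2

# Build an opener whose default headers include a custom User-agent.
opener = urllib2.build_opener()
opener.addheaders = [("User-agent", "Custom")]
# Install it globally: every later urllib2.urlopen() call in this process uses it.
urllib2.install_opener(opener)
# httpbin echoes the request headers back, which confirms the header was sent.
print(urllib2.urlopen(urllib2.Request("http://httpbin.org/headers")).read())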

I'm leaning against that approach since it affects everything using urllib2, but it would potentially allow us to avoid having to touch the http_open method.
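For comparison, here is a minimal sketch of keeping the header scoped instead of global. It is not something proposed in this thread, and whether it fits depends on how the WorldCat package issues its requests, which is not shown here. The names request and private_opener are illustrative only.

```python
try:
    import urllib2  # Python 2, as in the snippet above
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent module

# Option A: attach the header to a single Request object.
request = urllib2.Request(
    "http://httpbin.org/headers", headers={"User-agent": "Custom"}
)

# Option B: keep a private opener and never call install_opener(),
# so plain urllib2.urlopen() calls elsewhere are unaffected.
private_opener = urllib2.build_opener()
private_opener.addheaders = [("User-agent", "Custom")]

print(request.get_header("User-agent"))
print(private_opener.addheaders)
# private_opener.open(request) would actually send it (requires network access).
```

Either form only helps when the calling code can supply its own request or opener, which is the trade-off against the global install shown above.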