NYPL-Simplified / circulation

Circulation manager for Library Simplified

Evaluate Enki eBook Library API #125

Closed jce1028 closed 7 years ago

jce1028 commented 8 years ago

The following APIs are available for the Enki content host for CALIFA libraries. The user APIs allow patron validation as well as circulation functions. The 'lib' variable must be set to the number of the VuFind library. The id is the record id from the econtent_record.
http://www.enkilibrary.org/API/UserAPI?method=validateAccount&username=21901000000244&password=mckeegan&lib=1
http://www.enkilibrary.org/API/UserAPI?method=isLoggedIn
http://www.enkilibrary.org/API/UserAPI?method=placeEContentHold&username=21901000000244&password=mckeegan&lib=1&recordId=12
http://www.enkilibrary.org/API/UserAPI?method=checkoutEContentItem&username=21901000000244&password=mckeegan&lib=1&recordId=12
http://www.enkilibrary.org/API/UserAPI?method=activateEContentHold&username=21901000000244&password=mckeegan&lib=1&recordId=10509
http://www.enkilibrary.org/API/UserAPI?method=cancelEContentHold&username=21901000000244&password=mckeegan&lib=1&recordId=10509
http://www.enkilibrary.org/API/UserAPI?method=getPatronCheckedOutEContent&username=21901000000244&password=mckeegan&lib=1
http://www.enkilibrary.org/API/UserAPI?method=getPatronHoldsEContent&username=21901000000244&password=mckeegan&lib=1
http://www.enkilibrary.org/API/UserAPI?method=getPatronProfile&username=21901000000244&password=mckeegan&lib=1
http://www.enkilibrary.org/API/UserAPI?method=returnEContentRecord&username=21901000000244&password=mckeegan&lib=1&id=38743

The item APIs need to have 'econtentRecord' prepended to the econtent record id. I am going to fix that.
http://enkilibrary.org/API/ItemAPI?method=getExpiredOnlineItemsWithHolds
http://enkilibrary.org/API/ItemAPI?method=getOnlineItemsToReturn
http://enkilibrary.org/API/ItemAPI?method=getOverdueOnlineItems
http://enkilibrary.org/API/ItemAPI?method=getBasicItemInfo&id=econtentRecord11720
http://enkilibrary.org/API/ItemAPI?method=getBookCover&id=econtentRecord11720
http://enkilibrary.org/API/ItemAPI?method=getBookcoverById&id=econtentRecord11720
http://enkilibrary.org/API/ItemAPI?method=getItem&id=econtentRecord11720
http://enkilibrary.org/API/ItemAPI?method=getItemAvailability&id=econtentRecord11720

The SearchAPIs allow for searching in various ways.
http://enkilibrary.org/API/SearchAPI?method=getListWidget
http://enkilibrary.org/API/SearchAPI?method=getRecordIdForTitle&title=Cow%20in%20the%20parking%20lot
http://enkilibrary.org/API/SearchAPI?method=getSearchBar
http://enkilibrary.org/API/SearchAPI?method=getTitleInfoForISBN&isbn=9780761171997
http://enkilibrary.org/API/SearchAPI?method=getTopSearches
http://enkilibrary.org/API/SearchAPI?method=search&lookfor=cow%20in%20the%20parking%20lot&type=title
http://enkilibrary.org/API/SearchAPI?method=search&lookfor=cats&type=keyword

These relate to the book river and the searches that have been created:
http://enkilibrary.org/API/ListAPI?method=GetListTitles&id=search:41029
http://enkilibrary.org/API/ListAPI?method=GetListTitles&id=search:random
http://enkilibrary.org/API/ListAPI?method=GetListTitles&id=newEpub
http://enkilibrary.org/API/ListAPI?method=GetListTitles&id=availableEpub
http://enkilibrary.org/API/ListAPI?method=getPublicLists
http://enkilibrary.org/API/ListAPI?method=getRandomSystemListTitles
http://enkilibrary.org/API/ListAPI?method=getRSSFeed&id=newebooks
http://enkilibrary.org/API/ListAPI?method=getRSSFeed&id=search:675296

leonardr commented 8 years ago

My initial evaluation:

HTTPS

HTTPS is set up on www.enkilibrary.org and appears to work fine. We should use HTTPS exclusively when communicating with this API.
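As a minimal sketch of what "HTTPS exclusively" could look like on our side, the helper below builds every request URL from a single HTTPS base. The host and the API/method names come from the URLs listed in this issue; the helper name `enki_url` is my own invention for illustration.

```python
from urllib.parse import urlencode

# Host taken from the URLs in this issue; the scheme is fixed to HTTPS.
BASE_URL = "https://www.enkilibrary.org/API"


def enki_url(api, method, **params):
    """Build an HTTPS URL for one Enki API call (api is e.g. "ItemAPI")."""
    query = urlencode({"method": method, **params})
    return f"{BASE_URL}/{api}?{query}"


print(enki_url("ItemAPI", "getItem", id="econtentRecord11720"))
```

Keeping the scheme in one constant means no caller can accidentally fall back to plain HTTP.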

Patron info

All of the patron methods give me a 500 error. I tried both GET and POST on URLs like:

http://www.enkilibrary.org/API/UserAPI?method=validateAccount&username=21901000000244&password=mckeegan&lib=1

With a random made-up username and password I got the same error, not a 401 error.

Metadata

I used the getItem API call on three sets of 100 consecutive econtentRecords: 1-100, 1600-1700, and 11700-11800. Here's what I found by looking at those 300 records:
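For anyone who wants to repeat the sampling, here is a rough sketch of how the record IDs can be generated. It assumes each set is 100 consecutive IDs beginning at the first number of the range (so 300 in total); `sample_record_ids` is a hypothetical helper, not part of any API.

```python
def sample_record_ids(starts=(1, 1600, 11700), count=100):
    """Return record IDs for `count` consecutive records from each start."""
    return [
        f"econtentRecord{n}"  # getItem expects the 'econtentRecord' prefix
        for start in starts
        for n in range(start, start + count)
    ]


ids = sample_record_ids()
print(len(ids), ids[0], ids[-1])
```

Each ID can then be passed as the `id` parameter of a getItem call.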

id - Always present, e.g. "econtentRecord11720".

isbn - Always present, e.g. "9781440230387". These are the same ebook-specific ISBNs used by Overdrive and 3M, which means it's a crapshoot whether OCLC and Content Cafe will have records of them. This makes it more important (relatively speaking) that we get reliable data direct from the content source.

upc - The only value I saw for this was 'English'. Seems like a server-side error.

issn - Never saw a value.

"allIsbn", "allUpc", and "allIssn" are also present. I never saw them have more than one value -- the corresponding value for isbn or upc.

author - A single string in "Last, First." format. Only one name is ever present, even when OCLC says a book has multiple contributors. (e.g. "Silversmithing" by Rupert Finegold and William Seitz; Enki author field is "Finegold, Rupert.")

cover - The link provided is to a 'medium' version. 'small' and 'large' are also available by hacking the URL. 'large' is not as big as Overdrive or 3M (usually 300x400) but is big enough to serve as a source for thumbnails.

Example: https://enkilibrary.org/bookcover.php?id=econtentRecord11720&isbn=9781440229220&upc=&category=EMedia&format=&size=large

description - Usually plain text, occasionally HTML. Prose is good quality.

publisher - Plain text.

language - "English" is the only value I saw. Presumably the names of the languages are themselves in English.

format - Two distinct values seen: {"Adobe PDF":"Adobe PDF"} {"EPUB":"EPUB"}

formatCategory - Only value seen is "EMedia". Probably used to distinguish between physical items and e-items.

ratingData - Detailed information about user ratings on a 5-star system. Of the 300 books I checked, only 13 had any ratings.

tagList - No data present.
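Two of the fields above need a little massaging before we can use them. The sketch below rewrites the `size` parameter of a cover URL (the parameter name is taken from the example URL above) and flattens a `format` value into a list of names. Both function names are mine, and I am assuming `format` is always a JSON object keyed by format name, as in the two values I saw.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cover_url(url, size):
    """Return `url` with its `size` query parameter set to `size`."""
    parts = urlsplit(url)
    # keep_blank_values preserves empty parameters such as "upc=".
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = [(key, value) for key, value in pairs if key != "size"]
    query.append(("size", size))
    return urlunsplit(parts._replace(query=urlencode(query)))


def format_names(format_field):
    """Flatten a value like {"EPUB": "EPUB"} into a sorted list of names."""
    return sorted(format_field)


large = (
    "https://enkilibrary.org/bookcover.php?id=econtentRecord11720"
    "&isbn=9781440229220&upc=&category=EMedia&format=&size=large"
)
print(cover_url(large, "small"))
print(format_names({"Adobe PDF": "Adobe PDF"}))
```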

Availability

Availability data is kept in the 'holdings' section of the item info. 'holdings' is a list but I never saw the list contain more than one item. These keys make me think that the 'holdings' section is scoped to the currently authenticated user:

I never saw a book that was checked out or on hold. These keys only make sense if the API is telling you whether you have the book checked out or on hold. However, I also didn't see any indicators of whether a book was available -- i.e. whether you could get it right now with checkoutEContentItem or whether you had to put it on hold with placeEContentHold.

In the search results (but not in the main API) there is a key called 'num_holdings' which I expect is the total number of licenses for a book currently held by the library:

"num_holdings":3

Missing features (essential)

Detailed author information

a) I need the name of every author, not just one author. (This is essential)

b) If some of the authors were not 'authors' per se but made other contributions to the book (e.g. illustrator, translator, author of foreword), this role information should be present. (This is nice-to-have)

I know that all authors are in the dataset because I can see them in search results:

"author2":["Finegold, Rupert.","Seitz, William,"]

This item from the search index indicates that role information is probably not in the dataset, but it might be present but ignored:

"title":"Abraham Lincoln's Law Notes",
"author2":["Lincoln, Abraham","Dirck, Brian","Ceresi, Frank"],
"description":"Before he was President, Abraham Lincoln was a lawyer, and he wrote this little-known essay with his advice on law practice, from avoiding litigation to practicing extemporaneous speaking. Above all, is integrity. \"[R]esolve to be honest at all events; and if in your own judgment you cannot be an honest lawyer, resolve to be honest without being a lawyer.\" Former judge Frank Ceresi provides commentary and the book includes a thoughtful introduction by Professor Brian Dirck, the foremost expert on Lincoln's career as a lawyer."

Here, Brian Dirck and Frank Ceresi are credited as authors alongside Abraham Lincoln. If I looked up this book in Overdrive, Dirck would be credited as "author of introduction" and Ceresi would probably be credited as "compiler".
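If the API exposed the full "author2" list, turning it into display names would be straightforward. This is only a sketch: it assumes every entry is in "Last, First" order with optional trailing punctuation, as in the two examples above, and it cannot recover roles because none are present. Stripping the trailing period would also clip a final initial (e.g. "Q."), so treat it as a starting point.

```python
def display_name(sort_name):
    """Convert "Last, First." into "First Last"."""
    # Drop the trailing "." or "," seen in the search index values.
    cleaned = sort_name.strip().rstrip(".,").strip()
    if "," not in cleaned:
        return cleaned
    last, first = cleaned.split(",", 1)
    return f"{first.strip()} {last.strip()}"


def contributors(author2):
    """Map an "author2" list to display names, preserving order."""
    return [display_name(name) for name in author2]


print(contributors(["Finegold, Rupert.", "Seitz, William,"]))
```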

Classifications

By this I mean a book's classification under BISAC headings, or however these books come classified by the publisher/distributor. Without this information we can't classify a book as fiction/nonfiction, or understand its genre or anything about its intended audience. I would expect to see this information in 'tagList' but 'tagList' is always empty.

I know this information is in the dataset because I can see it used in search results:

"subject_facet":["PETS Reference","Cats","PETS Cats General","Pets"]

This looks like a processed version of the BISAC headings "PETS / Reference" and "PETS / Cats / General".
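To illustrate the guess, here is a rough heuristic that pulls the leading all-caps run out of each facet and treats it as the top-level BISAC heading. This is an assumption about how the facets were processed, not something the API documents, and it cannot tell where the lower levels ("Cats" vs. "General") split.

```python
def top_level_heading(facet):
    """Return the leading all-caps words of a facet, or None if there are none."""
    words = []
    for word in facet.split():
        # Stop at the first word that is not upper case ("&" is allowed).
        if not (word.isupper() or word == "&"):
            break
        words.append(word)
    return " ".join(words) or None


def bisac_headings(subject_facet):
    """Collect the distinct top-level headings found in a subject_facet list."""
    found = {top_level_heading(facet) for facet in subject_facet}
    return sorted(heading for heading in found if heading)


print(bisac_headings(["PETS Reference", "Cats", "PETS Cats General", "Pets"]))
```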

Collection overview

At any given time Simplified needs to have a reasonably up-to-date view of the collection: which books are in the collection, which books are available right now, how many licenses are in the collection for those books (this is probably the num_holdings I mentioned earlier), how many copies of those books are available, and how many people are in the hold queue for the books that are not available.

In particular, we need to be able to ask 'how has the collection changed in the past five minutes?' We need to be able to find out about any books added or removed from the collection, and any loans and holds that have been executed since a given time.
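To make the requirement concrete, this is the kind of answer we need, sketched as a comparison of two full snapshots of the collection. The snapshot shape (a mapping from record ID to copies available) is hypothetical; nothing in the current API returns it, and polling full snapshots would only be a fallback for a real "changes since" call.

```python
def diff_snapshots(old, new):
    """Compare two {record_id: copies_available} snapshots of the collection."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    # Records present in both snapshots whose availability changed,
    # i.e. a loan, return, or hold happened in between.
    changed = sorted(
        record_id
        for record_id in old.keys() & new.keys()
        if old[record_id] != new[record_id]
    )
    return {"added": added, "removed": removed, "changed": changed}


before = {"econtentRecord12": 3, "econtentRecord10509": 0}
after = {"econtentRecord12": 2, "econtentRecord11720": 1}
print(diff_snapshots(before, after))
```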

I don't think this is possible right now; the focus seems to be on showing how the collection looks to a particular patron. This isn't useful to us, because we send the same OPDS feeds to everyone.

Missing features (nice to have)

Target Audience / Age range

It looks like this dataset has the same problem as the 3M dataset: YA books are filed under "JUVENILE FICTION" and "JUVENILE NONFICTION", instead of "YOUNG ADULT FICTION" and "YOUNG ADULT NONFICTION". This makes it very difficult to determine whether a book is intended for 14-year-olds or 2-year-olds. If this sort of audience information is anywhere else in the dataset, it would be very useful to get it through the API.

This is very important, but we already have to deal with a case where this data is missing (3M), so I'm classifying it as nice-to-have. If the information is not available we know how to muddle through.

Series information

If a book is part of a series it's useful to know the name of the series.

Publication date

This is present in the search index but not in the API.

"publishDate":["2011"]

Miscellaneous questions/suggestions

What does 'validateAccount' do? Does it give me a patron token that I can use in future requests?

Having separate methods for getting a patron's list of loans and getting a list of their holds is a little inefficient--I'm always going to call both methods. It would save patron time to have one API method that returns both pieces of information.
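Until there is a combined method, we would wrap the two calls ourselves, roughly like this. The two method names are the ones listed at the top of this issue; `fetch` is a hypothetical callable standing in for whatever HTTP client performs the UserAPI request, and I make no assumption about the shape of the responses.

```python
def patron_activity(fetch, username, password, lib):
    """Call both patron methods and return their results together.

    `fetch(method, params)` is a hypothetical callable that performs the
    UserAPI request and returns the decoded response.
    """
    params = {"username": username, "password": password, "lib": lib}
    return {
        "loans": fetch("getPatronCheckedOutEContent", params),
        "holds": fetch("getPatronHoldsEContent", params),
    }


# Stand-in fetch for demonstration; it echoes the method name instead of
# making a network request.
def fake_fetch(method, params):
    return f"response for {method}"


print(patron_activity(fake_fetch, "patron", "secret", 1))
```

This still costs two round trips per patron, which is the inefficiency a single server-side method would remove.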

leonardr commented 8 years ago

The Enki API has been updated:

Enki UserAPI.docx

leonardr commented 7 years ago

I'm going to close this issue because the evaluation has been done.