rsinger / worldcat-linkeddata-php

A PHP linked data client for WorldCat.org
MIT License

Lots of ISBNs not found #9

Open cboulanger opened 6 years ago

cboulanger commented 6 years ago

Hi, I'm seeing strange behavior: a lot of perfectly fine ISBNs are not found.

For example: 9783593508863 certainly exists:

https://www.worldcat.org/search?q=bn%3A9783593508863&qt=advanced&dblist=638

but $manifestation->findByIsbn("9783593508863"); will result in:

Client error: `GET http://experiment.worldcat.org/entity/work/data/4617044771.jsonld` resulted in a `404 Not Found` response:
<html><head><title> - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-colo (truncated...)

This happens with roughly half of the ISBNs in my test data.

rsinger commented 6 years ago

Unfortunately, this is on OCLC's end: http://experiment.worldcat.org/oclc/1029546273.jsonld

See "exampleOfWork": "http://worldcat.org/entity/work/id/4617044771"

That leads to a 404. I guess it remains to be seen what sort of commitment OCLC has to maintaining their linked data. Given that their (beta) discovery API (https://www.oclc.org/developer/develop/web-services/worldcat-discovery-api/bibliographic-resource.en.html) uses basically the same data (here's an example from that documentation: https://www.oclc.org/content/dam/developer-network/web-services/WorldCatDiscovery/Bib-Book.json), maybe we can be hopeful that this is just a temporary issue.

@librarywebchic, can you provide any insight here?

rsinger commented 6 years ago

And to be clear, in your example, the ISBN is found - it's just linked to a duff work ID.

cboulanger commented 6 years ago

Thanks. Does this mean that a different link chain (and internal implementation) could lead to the desired data so that the ISBNs do not fail? Or is the work ID the only way to get to the data?

rsinger commented 6 years ago

I'm not aware of any alternative. It's worth noting that the other manifestation found in worldcat.org (https://www.worldcat.org/title/usurpation-und-autorisierung-konstituierende-gewalt-im-globalen-zeitalter/oclc/1029546273&referer=brief_results) points at the same non-existent work.

librarywebchic commented 6 years ago

I'm investigating to figure out what is going on with this.

cboulanger commented 6 years ago

Hello @librarywebchic, have you had time to take a look?

librarywebchic commented 6 years ago

@cboulanger @rsinger Here is what I learned.

The work URIs being published in the worldcat.org linked data are autogenerated from the work IDs in the records and not directly connected to a linked data workflow. The descriptions the URIs should point to were part of an experimental project to create work and person entity descriptions. The process was last run about 2 ½ years ago as part of the Person Entity Lookup pilot and so any resource added to WorldCat since then would not have a description generated for it. With no description to resolve to, the URI returns a 404 (page not found) error.
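Given that explanation, one way a client could cope is to treat the work description as optional instead of letting the 404 abort the whole ISBN lookup. The following is only a minimal sketch under stated assumptions: `fetchWorkDescription` and the `$fetch` callable (returning an array with `status` and `body`) are hypothetical names for illustration and are not part of worldcat-linkeddata-php.

```php
<?php

/**
 * Hypothetical helper: returns the decoded work description, or null when
 * the work URI does not resolve (HTTP 404).
 *
 * $fetch is an assumed callable of the shape
 *   fn(string $url): array{status: int, body: string}
 */
function fetchWorkDescription(string $workUrl, callable $fetch): ?array
{
    $response = $fetch($workUrl);

    if ($response['status'] === 404) {
        // No work description was generated for this resource.
        return null;
    }
    if ($response['status'] !== 200) {
        throw new RuntimeException("Unexpected HTTP status {$response['status']} for $workUrl");
    }

    $data = json_decode($response['body'], true);
    if (!is_array($data)) {
        throw new RuntimeException("Response from $workUrl is not a JSON object or array");
    }
    return $data;
}

// Stand-in fetcher for illustration only; a real one would issue an HTTP GET.
$fakeFetch = function (string $url): array {
    return str_ends_with($url, '/4617044771.jsonld')
        ? ['status' => 404, 'body' => '<html>Error report</html>']
        : ['status' => 200, 'body' => '{"@id": "http://example.org/work/1"}'];
};

$missing = fetchWorkDescription(
    'http://experiment.worldcat.org/entity/work/data/4617044771.jsonld',
    $fakeFetch
);
$present = fetchWorkDescription('http://example.org/work/1.jsonld', $fakeFetch);
```

With this shape, the caller can still return the manifestation data when `$missing` is null and only attach work-level data when a description exists. The sketch uses `str_ends_with`, so it assumes PHP 8 or later.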

cboulanger commented 6 years ago

@librarywebchic Karen, thank you for your research on this issue. That's unfortunate. So there is currently no reliable way of getting the work ID by providing an ISBN. Or have I missed something?

librarywebchic commented 6 years ago

The work IDs themselves are valid: they exist in the bibliographic records. The URIs return a 404 because we are not creating and publishing new Work Descriptions in RDF.
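Since the IDs remain valid even when the URIs do not resolve, one option is to read the work ID directly from the `exampleOfWork` URI in the manifestation data, without dereferencing it. This is a rough sketch only: `extractWorkId` is a hypothetical helper, and the URI shapes it accepts are assumptions based on the two work URLs quoted earlier in this thread.

```php
<?php

/**
 * Hypothetical helper: pulls the numeric work ID out of a WorldCat work URI
 * without making an HTTP request. The accepted URI shapes are assumptions
 * based on the two URLs quoted in this thread.
 */
function extractWorkId(string $workUri): ?string
{
    // Matches .../entity/work/id/<digits> and .../entity/work/data/<digits>.jsonld
    if (preg_match('#/entity/work/(?:id|data)/(\d+)(?:\.jsonld)?$#', $workUri, $m) === 1) {
        return $m[1];
    }
    return null;
}

$fromRecord = extractWorkId('http://worldcat.org/entity/work/id/4617044771');
$fromError  = extractWorkId('http://experiment.worldcat.org/entity/work/data/4617044771.jsonld');
$unrelated  = extractWorkId('http://experiment.worldcat.org/oclc/1029546273.jsonld');
```

The ID is returned as a string rather than an integer because a value such as 4617044771 does not fit in a 32-bit signed integer.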