Closed jrochkind closed 1 year ago
Draft public doc intro for API availability. Not sure where it will go. Re-written from FAQ "Can I freely download or extract your metadata?". Being edited here for now.
We strive to make our open data freely available, but the options we provide for machine-readable metadata access currently consist of somewhat limited and disparate services. If you have a project that could benefit from more convenient or standardized machine-accessible APIs for metadata access, please get in touch to share your use case.
We have an OAI-PMH feed that provides access to our metadata in an XML format. The fields are based on the OAI-DC schema, with extensions suggested by the DPLA metadata application profile, since the feed is mainly intended for use by DPLA.
This metadata includes standardized basic descriptive attributes, but does not include all internal, administrative, and relational metadata.
You can bulk harvest via an OAI-PMH 2.0 endpoint at https://digital.sciencehistory.org/oai
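As a rough illustration of bulk harvesting, here is a minimal sketch using only the Python standard library. It relies on standard OAI-PMH 2.0 behavior: the `ListRecords` verb, the `oai_dc` metadata prefix that every OAI-PMH 2.0 repository must support, and resumption tokens for paging. The function names are made up for this example, and whether the DPLA-extended fields are exposed under a different metadata prefix is not covered here.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI_ENDPOINT = "https://digital.sciencehistory.org/oai"
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(resumption_token=None, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for the first page or a follow-up page."""
    if resumption_token:
        # In OAI-PMH 2.0, resumptionToken is an exclusive argument:
        # it must not be combined with metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return OAI_ENDPOINT + "?" + urllib.parse.urlencode(params)


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An absent or empty resumptionToken element means this is the last page.
    token = token_el.text.strip() if token_el is not None and token_el.text else None
    return identifiers, token or None


def harvest_identifiers():
    """Yield every record identifier, following resumption tokens (uses the network)."""
    token = None
    while True:
        with urllib.request.urlopen(list_records_url(token)) as response:
            identifiers, token = parse_page(response.read())
        yield from identifiers
        if token is None:
            break
```

A real harvester would likely also want to pause between requests and handle HTTP errors; those are left out to keep the sketch short.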
You can also get an OAI-DC XML representation of any record by adding .xml to the end of the record's URL. For instance, https://digital.sciencehistory.org/works/vt150j62m.xml.
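To show how a single record's XML might be consumed, the sketch below collects every element in the Dublin Core namespace into a dictionary of lists. It assumes only that the response contains elements in the standard `http://purl.org/dc/elements/1.1/` namespace; the exact wrapper element and any DPLA extension fields are not assumed, and the function names are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET
from collections import defaultdict

DC_NS = "http://purl.org/dc/elements/1.1/"


def dc_fields(xml_text):
    """Collect Dublin Core element values, keyed by local name (e.g. 'title')."""
    fields = defaultdict(list)
    prefix = "{" + DC_NS + "}"
    for el in ET.fromstring(xml_text).iter():
        # Keep only elements in the DC namespace that carry non-blank text.
        if el.tag.startswith(prefix) and el.text and el.text.strip():
            fields[el.tag[len(prefix):]].append(el.text.strip())
    return dict(fields)


def fetch_dc_fields(record_url):
    """Fetch '<record URL>.xml' and parse its Dublin Core fields (uses the network)."""
    with urllib.request.urlopen(record_url.rstrip("/") + ".xml") as response:
        return dc_fields(response.read())
```

For example, `fetch_dc_fields("https://digital.sciencehistory.org/works/vt150j62m")` would request the `.xml` URL shown above. Values are kept as lists because Dublin Core elements such as subject can repeat.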
Any search result is available in the Atom Syndication Format. Just add .atom to the path of any search results URL, for instance:
https://digital.sciencehistory.org/catalog.atom?q=chemistry
instead of:
https://digital.sciencehistory.org/catalog?q=chemistry
You can also access Atom search results within any collection. This includes listing all items in a collection. For instance, for the Oral History Collection: https://digital.sciencehistory.org/collections/gt54kn818.atom
Or with a query:
https://digital.sciencehistory.org/collections/gt54kn818.atom?q=biomedicine
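The URL rewriting described above, plus basic feed parsing, could be sketched as follows. The helper names are invented for this example; the only assumptions about the feed are the standard Atom namespace and the usual `entry`, `title`, and `link` elements. Nothing is assumed about paging.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def to_atom_url(url):
    """Append '.atom' to the path of a search or collection URL, keeping the query."""
    parts = urllib.parse.urlsplit(url)
    path = parts.path.rstrip("/")
    if not path.endswith(".atom"):
        path += ".atom"
    return urllib.parse.urlunsplit(parts._replace(path=path))


def parse_entries(atom_text):
    """Return a list of {'title', 'link'} dicts, one per Atom entry."""
    root = ET.fromstring(atom_text)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        # An entry may carry several links; this sketch takes the first one.
        link = entry.find("atom:link", ATOM_NS)
        entries.append({
            "title": title.strip(),
            "link": link.get("href") if link is not None else None,
        })
    return entries


def fetch_entries(url):
    """Fetch the Atom version of a search or collection URL (uses the network)."""
    with urllib.request.urlopen(to_atom_url(url)) as response:
        return parse_entries(response.read())
```

For instance, `to_atom_url("https://digital.sciencehistory.org/catalog?q=chemistry")` yields the `catalog.atom?q=chemistry` URL given earlier, and calling it on a URL that already ends in `.atom` leaves the URL unchanged.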
For every "work", you can access metadata in an XML/OAI-DC format, or a local internal JSON format.
The OAI-DC format uses a standardized vocabulary (based on the DPLA metadata application profile) and should be fairly stable. However, it includes only a subset of our metadata. For example: https://digital.sciencehistory.org/works/46k32ki.xml
The JSON format is a closer representation of our internal metadata, and includes a larger subset of all metadata. However, while we will endeavor to keep it stable, it is more likely to change as a result of internal software changes. E.g.: https://digital.sciencehistory.org/works/46k32ki.json
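Because the JSON format may change, a client probably wants to read it defensively. The sketch below builds the per-work URL for either format and picks out only the keys that are actually present. The helper names are hypothetical, and no particular JSON key names are assumed; callers supply whichever keys they care about.

```python
import json
import urllib.request

WORKS_BASE = "https://digital.sciencehistory.org/works/"


def work_url(work_id, fmt):
    """Build the metadata URL for a work in 'xml' (OAI-DC) or 'json' format."""
    if fmt not in ("xml", "json"):
        raise ValueError("fmt must be 'xml' or 'json'")
    return f"{WORKS_BASE}{work_id}.{fmt}"


def pick(record, wanted):
    """Return only the wanted keys that exist, so renamed or removed fields don't raise."""
    return {key: record[key] for key in wanted if key in record}


def fetch_work_json(work_id):
    """Fetch and decode the JSON metadata for one work (uses the network)."""
    with urllib.request.urlopen(work_url(work_id, "json")) as response:
        return json.load(response)
```

A caller might then write `pick(fetch_work_json("46k32ki"), ["title"])`, where `"title"` stands in for whatever key the JSON actually uses, and treat a missing key as a sign that the internal format has changed.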
At present we do not have an API response that will give access to individual files (for instance page images or audio files).
@eddierubeiz when you have a chance, could you give my API docs draft above a look and feedback?
A couple notes:
Atom Feeds: "Just substitute catalog.atom for catalog in any search result URL: for instance, https://digital.sciencehistory.org/catalog.atom?q=chemistry instead of https://digital.sciencehistory.org/catalog?q=chemistry."
As a specific use case for the existing #201, we now want to focus on the Max Planck oral history project and create an API that they can use to get our oral history metadata.
Treat them as a use case/user when making requirements and spec'ing out what we have (although it would ideally be available for all items, not just OH, for #201).