anjackson closed 7 months ago
Updated to link records back, and to pull out institutions, at least up to a point. Works pretty well in the browser; see this Datasette Lite view.
Looking at the other years. IDEALS turned out to be easier, as I can at least grab chunks of metadata via OAI-PMH and only have to futz with the configuration of that in order to have something useful. To my surprise, OSF is proving more difficult to work with, with two different API versions that don't seem to line up well, and with it being necessary to grab a 'tree' of different files.
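For reference, the OAI-PMH side of this can be sketched roughly as below. This is only an illustration under assumptions, not the code used for the IDEALS harvest: the base URL, the `oai_dc` metadata prefix, and the function names `list_records_url` and `parse_list_records` are all made up here, and the real endpoint and set configuration would need checking. It builds a `ListRecords` request and pulls identifiers, titles and the resumption token out of a response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Standard OAI-PMH 2.0 and Dublin Core namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, set_spec=None, resumption_token=None):
    """Build a ListRecords URL (hypothetical helper; base_url is an assumption)."""
    if resumption_token:
        # A resumptionToken is an exclusive argument: send only verb + token.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)

def parse_list_records(xml_text):
    """Return (records, next_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
            "creators": [c.text for c in rec.iterfind(".//dc:creator", NS) if c.text],
        })
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS).strip()
    return records, (token or None)

# Tiny made-up response, just to show the shape of the data.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example paper</dc:title>
          <dc:creator>A. Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>abc123</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, next_token = parse_list_records(SAMPLE)
```

Looping until `next_token` is `None` would walk the whole set; the actual fetching (and any rate limiting) is left out on purpose.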
Now updated with some (slightly sketchy) 2022 and 2023 data in place, here.
Okay, refactored a bit and used Ed's suggested citation_pdf_url trick to pull in the document URLs for 2023. Latest version now has separate landing page and direct document URLs. See here.
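In case it helps anyone following along, the citation_pdf_url idea amounts to reading a `<meta name="citation_pdf_url" content="...">` tag out of each landing page. A minimal sketch using only the standard library is below; the class and function names are hypothetical, the sample HTML is invented, and this is not necessarily how the actual script does it.

```python
from html.parser import HTMLParser

class CitationPdfParser(HTMLParser):
    """Collects the content of any <meta name="citation_pdf_url"> tags."""

    def __init__(self):
        super().__init__()
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name == "citation_pdf_url" and attr.get("content"):
            self.pdf_urls.append(attr["content"])

def find_pdf_urls(html_text):
    """Return all citation_pdf_url values found in a landing page's HTML."""
    parser = CitationPdfParser()
    parser.feed(html_text)
    return parser.pdf_urls

# Invented landing page fragment for illustration.
SAMPLE_PAGE = """<html><head>
<meta name="citation_title" content="An example paper">
<meta name="citation_pdf_url" content="https://example.org/files/paper.pdf" />
</head><body></body></html>"""

pdf_urls = find_pdf_urls(SAMPLE_PAGE)
```

A page may carry more than one such tag, so the sketch returns a list rather than a single URL.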
2022 data still a bit lacking, as OSF integration needs work.
Probably need to spend a little time thinking about the tables/structures. e.g.
Pretty clear, I think, that for now I'm going to have to patch up some of the metadata by hand, but I can at least get the basics in place from the repositories.
Ah, if you add a _searchmode=raw you can do proper searches, like "this" OR "that". See here.
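A quick sketch of building such a search link, assuming the table has full-text search enabled and is queried through Datasette's `_search` parameter. The table URL below is a placeholder, not the real one, and `raw_search_url` is just an illustrative helper.

```python
from urllib.parse import urlencode

def raw_search_url(table_url, query):
    """Build a Datasette table URL that passes the query to SQLite FTS unescaped."""
    params = {"_search": query, "_searchmode": "raw"}
    return table_url + "?" + urlencode(params)

# Placeholder table URL; substitute the real Datasette table path.
url = raw_search_url("https://example.org/db/publications", '"this" OR "that"')
```

Without `_searchmode=raw` the query is escaped before it reaches SQLite, which is why operators like OR only work in raw mode.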
Okay, so now https://www.digipres.org/publications/ hosts a web-based version and points to the DB/Datasette version.
Good enough for v1
Broadly following the Format Aggregator pattern... Gather metadata and text from iPres proceedings. Make it easier for Google to find. Make it easy to search across. Start to think about how to link formal identifiers into the system, so we can find e.g. papers about GIFs.
This is a first iteration to demonstrate the idea, focussed on iPres proceedings.
citation_pdf_url (idea from Ed)
Writing up covered by digipres/registries-of-practice-project#7
Further work covered by #5