ualbertalib / HydraNorth

This repo is deprecated. Succeeded by https://github.com/ualbertalib/jupiter. This codebase was an IR built on Samvera/Sufia.

Dataverse to HydraNorth #67

Closed pbinkley closed 9 years ago

pbinkley commented 9 years ago

As a data librarian, I want to be able to ingest item-level metadata from Dataverse into the DAMS, so that Dataverse items are discoverable via the HydraNorth discovery layer while they are still accessed in Dataverse.

Part of epic #69

pbinkley commented 9 years ago

This may turn out to be too difficult to achieve without adding new models, but let's have a look and see if it can be done in this sprint.

Out of scope: direct deposit of items without datastreams

In scope:

Next steps:

pbinkley commented 9 years ago

[Sharon:] Earlier discussion had concluded that we would allow the deposit of datasets into Hydra (ERA). If this is to be changed then the form will need to be adjusted accordingly and any datasets currently in ERA assessed for handling.

johnhuck commented 9 years ago

The product owner has determined that, for purposes of the September launch, the scope for this story pertains only to basic data discoverability, such as can be achieved by importing OAI_DC into HydraNorth from Dataverse.

Chuck and Larry will develop more detailed user stories around data search and discovery in Hydra to inform work on phase 2 (post-September) of this user story, which will revisit the question of whether or not to import DDI-RDF. We will look for gaps between these more granular user stories and the functionality of the DC imported in phase 1.

johnhuck commented 9 years ago

Sent a first draft of a transformed oai_dc metadata sample to Weiwei on June 19th. The transformed metadata:

- updates the namespaces of the dc: elements to dcterms:
- maps values in dc:date to dcterms:created
- maps values in dc:coverage to either dcterms:spatial or dcterms:temporal
- maps dc:type (which contains free text values) to dcterms:description
- assigns <dcterms:type>Dataset</dcterms:type> to all datasets to conform to established type values in HydraNorth
- excludes dc:description elements that contain a formatted citation
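For reviewers, the mapping rules listed above can be sketched roughly as follows. This is an illustrative sketch only, not the transformation that was actually sent or the script in the migration branch. The function names (`transform`, `looks_like_citation`), the year-based test for choosing between spatial and temporal coverage, the "doi:"/"hdl:" test for spotting a formatted citation, and the `metadata` wrapper element are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"

# Serialize output elements with a dcterms: prefix.
ET.register_namespace("dcterms", DCTERMS)

# Assumption: a coverage value containing a four-digit year is temporal;
# anything else is treated as spatial.
YEAR = re.compile(r"\b\d{4}\b")


def looks_like_citation(text):
    # Assumption: a formatted citation mentions a DOI or a handle.
    lowered = text.lower()
    return "doi:" in lowered or "hdl:" in lowered


def transform(oai_dc_xml):
    """Return a new element holding dcterms: versions of the dc: children."""
    source = ET.fromstring(oai_dc_xml)
    out = ET.Element("metadata")  # hypothetical wrapper element
    for el in source:
        if not el.tag.startswith("{" + DC + "}"):
            continue  # only dc: elements are mapped
        name = el.tag.split("}", 1)[1]
        text = (el.text or "").strip()
        if name == "description" and looks_like_citation(text):
            continue  # exclude formatted citations
        if name == "date":
            name = "created"
        elif name == "coverage":
            name = "temporal" if YEAR.search(text) else "spatial"
        elif name == "type":
            name = "description"  # free-text type values become descriptions
        ET.SubElement(out, "{%s}%s" % (DCTERMS, name)).text = text
    # Every dataset gets the fixed type value.
    ET.SubElement(out, "{%s}type" % DCTERMS).text = "Dataset"
    return out
```

Calling `ET.tostring(transform(record_xml), encoding="unicode")` on a single oai_dc record would give the dcterms: output for inspection. The coverage and citation heuristics are the parts most likely to need replacing once real Dataverse records are reviewed.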

Questions that remain:

Other issues: At least one of the oai_dc records (10205) wouldn't resolve with its DOI. The metadata was very thin, which makes me wonder if it is an unreleased study or was withdrawn. When we start to work with a more current metadata extract, we should check to see if this issue resolves itself.

weiweishi commented 9 years ago

The migration script is in the dataverse-migration branch for demoing and reviewing the work. Related questions/issues: #452, #451, #450, #449, #448. The migrated objects are in Plano.

Will leave it to @pbinkley and @johnhuck to decide if this ticket can be closed.

pgwillia commented 9 years ago
pbinkley commented 9 years ago

Sonya and John will chase down answers to outstanding questions and we'll come up with stories to reflect the remaining work. Tricia will do a PR of the current state of this story, and we'll close it.