UNC-Libraries / Curators-Workbench

This project has been archived and is no longer being developed or supported. The Curator's Workbench is an extensible digital collection and appraisal tool for the desktop. It is designed to acquire and process batch data efficiently while giving the user control over work flow.
http://blogs.lib.unc.edu/cdr/index.php/about/cdr-development-and-collab/curators-workbench/

Crosswalk reuse and templates (support Data Dictionary) #17

jjksexton closed this issue 12 years ago

jjksexton commented 13 years ago

I did not see a way to reuse crosswalks between projects. Many projects will use a common delimited file format, and it would save a lot of time to be able to reuse a crosswalk that was already created instead of having to make one from scratch with each project.

I would also like to request a data dictionary template. The new metadata mapper is configurable, but it is very confusing to someone who doesn't know MODS well. A data dictionary template would pre-populate the canvas with the set of output elements in the data dictionary.

gregjan commented 13 years ago

Okay, a little thinking out loud here...

A data dictionary is a set of templates and best practices for using a metadata schema. It breaks down into a set of separate recommendations that say "record this information with this element and always like this". Depending upon the schema, the recommendations can be complex and users are required to consult the dictionary often.

I'm thinking that there's a lot of correspondence between what we might call "output templates" in the crosswalk editor and data dictionary recommendations.

A template and a data dictionary recommendation both encapsulate a particular structure of XML for a particular purpose, such as a certain way of writing the MODS name element when recording a faculty author. The recommendation in a data dictionary is generally more discursive and detailed than a "template".

I'd like to explore recreating a sort of data dictionary inside the crosswalk editor. It should be easy to use and provide a list of named templates with some documentation. Each template is a complex configuration of tags and default settings that reduces to just a few inputs. In other words, a template provides a short list of input nodes for a particular use case, hiding complexity.

For example, take a faculty author template. Inside of this template there would need to be mappings and default settings for a compound name element in MODS, exactly as you would need in the crosswalk editor now. All of that detail might be irrelevant to the user who just needs to map first and last name from a spreadsheet. They should be able to connect first and last names to "first name" and "last name" inputs at the edge of the template and not worry about the tags inside.
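To make the idea concrete, here is a rough sketch in Python. It is not how the Workbench implements templates; the function name `faculty_author`, the sample row, the column names, and the choice of a plain-text "author" role are all assumptions made for illustration. It only shows a template exposing two named inputs while keeping the MODS name structure inside.

```python
import xml.etree.ElementTree as ET

# MODS version 3 namespace.
MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)


def faculty_author(first_name, last_name):
    """Hypothetical "Faculty Author" template.

    The caller supplies only the two named inputs; the compound
    MODS name structure and its default settings stay inside.
    """
    name = ET.Element(f"{{{MODS_NS}}}name", {"type": "personal"})

    given = ET.SubElement(name, f"{{{MODS_NS}}}namePart", {"type": "given"})
    given.text = first_name

    family = ET.SubElement(name, f"{{{MODS_NS}}}namePart", {"type": "family"})
    family.text = last_name

    # Default setting hidden from the user (assumed for this sketch).
    role = ET.SubElement(name, f"{{{MODS_NS}}}role")
    term = ET.SubElement(role, f"{{{MODS_NS}}}roleTerm", {"type": "text"})
    term.text = "author"
    return name


# Made-up spreadsheet row; the column names are illustrative only.
row = {"First": "Jane", "Last": "Doe"}
element = faculty_author(row["First"], row["Last"])
print(ET.tostring(element, encoding="unicode"))
```

Whoever maintains the data dictionary would own the body of the function, while the person doing the migration only connects spreadsheet columns to its two inputs.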

This ends up looking a lot like some of the UI in crosswalks 1.0, where the complexity of MODS was hidden from the everyday user. It could separate the work of maintaining the data dictionary from the work of metadata migration.

Take a look at this image: http://www.lib.unc.edu/blogs/cdr/wp-content/uploads/2011/08/nameElement.png Now imagine that you slapped the label "Faculty Author" on top of this structure and added some named inputs on the edge, where the green lines exit. That's the sort of thing I have in mind.

gregjan commented 12 years ago

Dictionaries are an overall workbench preference. Import or link to dictionaries.

gregjan commented 12 years ago

This feature is complete and will be in release 4.0.