ucsdlib / dams5-cc-pilot

A repository for doing shared R&D on CurationConcerns for the Development team.
MIT License

Data Model property clarification needed #14

Closed mcritchlow closed 8 years ago

mcritchlow commented 8 years ago

Per @lsitu

We need to review each of the following highlighted properties and clarify the predicate and expected range:

https://github.com/ucsdlib/dams5-cc-pilot/blob/611f839f846b0052e21ffd08a4e05ff1eaaaadf0/app/schemas/general_schema.rb#L55-L64

I think we've settled language via discussion in #2

cc/ @ucsdlib/domm

lsitu commented 8 years ago

@ucsdlib/domm: When reviewing the new data model for implementation, I found that the following properties may still need to be clarified:

*Language -- predicate: ::RDF::Vocab::DC.language dc:LinguisticSystem Should we restrict it to URI?

*Location -- predicate: ::RDF::Vocab::DC.spatial dpla:Place Do we still use it?

*Vessel -- predicate: r2r:vesselName, class_name: Agent What's the namespace r2r?

*isReplacedBy -- predicate: dpla:isReplacedBy Should we use dpla namespace? What is the value type?

*replaces -- predicate: dpla:replaces Should we use dpla namespace? What is the value type?

*rightsOverride -- predicate: pcdmrts:rightsOverride xsd:anyURI What is the namespace url for pcdmrts?

*rightsOverrideExpiration -- predicate: pcdmrts:rightsOverrideExpiration xsd:dateTime What is the namespace url for pcdmrts? Should we use xsd:dateTime or edm:TimeSpan?

*copyright_status -- predicate: ::RDF::Vocab::PREMIS.hasCopyrightStatus premis:copyrightStatus => How should we model premis:copyrightStatus?

*type -- predicate: ::RDF::Vocab::DC.type dc:DCMIType How should we model dc:DCMIType?

arwenhutt commented 8 years ago

Here's what we have so far. More to come.

*Language -- predicate: ::RDF::Vocab::DC.language dc:LinguisticSystem Should we restrict it to URI? -- yes, we've updated the range to "xsd:anyURI"

*Location -- predicate: ::RDF::Vocab::DC.spatial dpla:Place Do we still use it? -- no, we discussed it and decided to omit the location property and just use the "spatial" property for all geographic data

*Vessel -- predicate: r2r:vesselName, class_name: Agent What's the namespace r2r? -- it's rolling decks to repositories, but they don't appear to have good documentation of their schema so we decided to use a local predicate for now - ucsd:vesselName

lsitu commented 8 years ago

@arwenhutt Thanks for the update and clarification.

arwenhutt commented 8 years ago

*isReplacedBy -- predicate: dpla:isReplacedBy Should we use dpla namespace? What is the value type? -- we discussed and decided we don't currently need this relationship. It's been struck through in the data dictionary.

*replaces -- predicate: dpla:replaces Should we use dpla namespace? What is the value type? -- we discussed and decided we don't currently need this relationship. It's been struck through in the data dictionary.

*rightsOverride -- predicate: pcdmrts:rightsOverride xsd:anyURI What is the namespace url for pcdmrts? -- http://pcdm.org/2015/06/03/rights (I've added it to the namespaces tab)

*rightsOverrideExpiration -- predicate: pcdmrts:rightsOverrideExpiration xsd:dateTime What is the namespace url for pcdmrts? Should we use xsd:dateTime or edm:TimeSpan? -- xsd:dateTime should be fine since we can impose a date format on this (we don't want squishy dates here)

*copyright_status -- predicate: ::RDF::Vocab::PREMIS.hasCopyrightStatus premis:copyrightStatus => How should we model premis:copyrightStatus? -- it's a controlled value list http://id.loc.gov/vocabulary/preservation/copyrightStatus.html with the values:
------ http://id.loc.gov/vocabulary/preservation/copyrightStatus/cpr
------ http://id.loc.gov/vocabulary/preservation/copyrightStatus/pub
------ http://id.loc.gov/vocabulary/preservation/copyrightStatus/unk

*type -- predicate: ::RDF::Vocab::DC.type dc:DCMIType How should we model dc:DCMIType? -- it's a controlled value list http://dublincore.org/documents/dcmi-terms/#H7 BUT hold off on implementing this, I'm not sure if this is a correct range for us. We have a local type vocabulary and so we need to discuss this.
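For the "no squishy dates" point on rightsOverrideExpiration, here is a minimal Ruby sketch of strict xsd:dateTime checking. The helper name `parse_override_expiration` and the constant `XSD_DATETIME` are hypothetical, and requiring an explicit timezone is an assumption made for the example (xsd:dateTime itself allows it to be omitted); this is not the pilot's actual implementation.

```ruby
require "time"

# Assumed format: full date, time, and an explicit timezone,
# e.g. "2017-06-30T00:00:00Z" or "2017-06-30T00:00:00-08:00".
XSD_DATETIME = /\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})\z/

# Hypothetical helper: returns a Time for a well-formed value, nil otherwise.
def parse_override_expiration(value)
  return nil unless value.is_a?(String) && value.match?(XSD_DATETIME)

  Time.xmlschema(value)
rescue ArgumentError
  # Matches the pattern but is not a real date/time (e.g. month 13).
  nil
end
```

Free-text or partial dates such as "circa 2017" or "2017-06-30" are rejected rather than guessed at.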

lsitu commented 8 years ago

@arwenhutt : For copyright_status, should we use URI instead?

arwenhutt commented 8 years ago

@lsitu yup. that makes sense. We do want to restrict to specific URIs though, how do we specify that?

lsitu commented 8 years ago

@arwenhutt : I think we can provide a list of values for selection, or enforce it through value validation? It looks like other properties, like "local attribution" and format/type, have this issue too.
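The two options (a list of values for selection, or validation on input) can share one controlled list. A minimal Ruby sketch, using the three copyrightStatus URIs quoted earlier in this thread; the constant and method names are hypothetical, not taken from the pilot code.

```ruby
# The approved values, as listed for premis:copyrightStatus in this thread.
COPYRIGHT_STATUS_URIS = %w[
  http://id.loc.gov/vocabulary/preservation/copyrightStatus/cpr
  http://id.loc.gov/vocabulary/preservation/copyrightStatus/pub
  http://id.loc.gov/vocabulary/preservation/copyrightStatus/unk
].freeze

# Hypothetical check: true only for an exact match against the list.
def valid_copyright_status?(value)
  COPYRIGHT_STATUS_URIS.include?(value.to_s)
end
```

The same constant could feed a selection list in a form, so the displayed options and the validation cannot drift apart.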

arwenhutt commented 8 years ago

Yep. And maybe others. Is there a way you would like us to indicate this in the data dictionary?

lsitu commented 8 years ago

Maybe just add those fields that have CVs to another tab, like what we did for the Excel Input Stream?

arwenhutt commented 8 years ago

@remerjohnson can you work on the CVs? we need to document them (in the added spreadsheet) and look at verifying values. Related issue https://github.com/ucsdlib/dams5-cc-pilot/issues/18 Thanks!

remerjohnson commented 8 years ago

@arwenhutt Okay. I'll put it on the CV tab and mirror the Excel Input Stream.

arwenhutt commented 8 years ago

Thanks!

remerjohnson commented 8 years ago

A note that Chrissy/UCSB will use the URIs from LoC on this: LoC Resource Types

Not sure if we want to even look into that or stick with what we've got.

ghost commented 8 years ago

Just clarifying that dc:type is referring to the type of the original resource ("Source resource" as DPLA says), not the digital surrogate, right? If so, then do we need both type and genre? Only if we want to use both the limited range of DCMIType like "still image" and a more specific genre, like "Postcards".

arwenhutt commented 8 years ago

@GregReser That aligns with our current use of type of resource and genre, and what I was thinking.

@remerjohnson Do they align with our current values, including the weird ones (notated movement)? They may, since I think they were derived from RDA values.

ghost commented 8 years ago

No, we would have to choose a different range. DC doesn't limit type to DCMIType. Could it be anyURI? For genre, DPLA used edm:hasType which, from their definition, seems like a good fit.

http://pro.europeana.eu/files/Europeana_Professional/Share_your_data/Technical_requirements/EDM_Documentation//EDM_Definition_v5.2.7_042016.pdf

remerjohnson commented 8 years ago

@arwenhutt RE: resource type alignment. Greg and I looked at it and yes, in general they do, although not sure why they abbreviate the types in the URIs. e.g. still image at a glance is http://id.loc.gov/vocabulary/resourceTypes/img.html. I'll have to investigate the exact alignment more.

RE:genre, I think that it may not be possible/desirable to use an existing vocab for genre, which means we could set up a local vocab and/or point to FAST genre URIs where appropriate, e.g. 'Postcards' is http://id.worldcat.org/fast/1726707

arwenhutt commented 8 years ago

Genre will behave like other subjects right? So external vocabs when they work, local ones when they don't?
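A minimal Ruby sketch of that fallback, assuming a lookup keyed by label. The only external entry is the FAST URI for "Postcards" quoted above; `LOCAL_GENRE_BASE` is a placeholder namespace and `genre_uri` a hypothetical helper, neither of which comes from the data dictionary.

```ruby
# External vocabulary terms we know about (FAST example from this thread).
EXTERNAL_GENRES = {
  "postcards" => "http://id.worldcat.org/fast/1726707"
}.freeze

# Placeholder only: the real local namespace is not specified in this thread.
LOCAL_GENRE_BASE = "http://example.org/local/genre/"

# Hypothetical helper: prefer an external URI, otherwise mint a local one.
def genre_uri(label)
  key = label.to_s.strip.downcase
  EXTERNAL_GENRES.fetch(key) { LOCAL_GENRE_BASE + key.gsub(/[^a-z0-9]+/, "-") }
end
```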

remerjohnson commented 8 years ago

@arwenhutt That makes sense to me :+1:

arwenhutt commented 8 years ago

And I missed Greg's response, yeah I think it would be like some of the other values with CVs, a type of anyURI with a reference to our controlled list of "approved" values (documented on the CV tab).

ghost commented 8 years ago

Sounds good. Should we replace genreForm (row 116, col P) with genre (row 127, col P)?

arwenhutt commented 8 years ago

@ucsdlib/domm just discussed and we scratched genre (row 127, col P).

remerjohnson commented 8 years ago

A quick alignment review between our resource types and LoC's:

| Our label | LoC label | LoC URI |
| --- | --- | --- |
| cartographic | Cartographic | http://id.loc.gov/vocabulary/resourceTypes/car |
| data | Dataset | http://id.loc.gov/vocabulary/resourceTypes/dat |
| mixed material | Mixed material | http://id.loc.gov/vocabulary/resourceTypes/mix |
| moving image | Moving image | http://id.loc.gov/vocabulary/resourceTypes/mov |
| multimedia | Multimedia | http://id.loc.gov/vocabulary/resourceTypes/mul |
| notated movement | Notated movement | http://id.loc.gov/vocabulary/resourceTypes/nmv |
| notated music | Notated music | http://id.loc.gov/vocabulary/resourceTypes/not |
| software | Software | http://id.loc.gov/vocabulary/resourceTypes/sof |
| sound recording | Audio* | http://id.loc.gov/vocabulary/resourceTypes/aud |
| sound recording-musical | Audio musical* | http://id.loc.gov/vocabulary/resourceTypes/aum |
| sound recording-nonmusical | Audio non-musical* | http://id.loc.gov/vocabulary/resourceTypes/aun |
| still image | Still image | http://id.loc.gov/vocabulary/resourceTypes/img |
| text | Text | http://id.loc.gov/vocabulary/resourceTypes/txt |
| three dimensional object | Artifact* | http://id.loc.gov/vocabulary/resourceTypes/art |

"Audio" has variant label "Sound Recording"
"Audio musical" has variant label "Sound recording - musical" and "Audio music"
"Audio non-musical" has variant labels "Sound recording - nonmusical" and "audio spoken"
"Artifact" has variant labels "3-dimensional object" and "Physical artifact"
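The alignment above could be captured as a simple lookup from our local labels to the LoC URIs. This is only a sketch in Ruby: the constant and method names are hypothetical, and the codes are copied from the URIs listed above.

```ruby
LOC_RESOURCE_TYPE_BASE = "http://id.loc.gov/vocabulary/resourceTypes/"

# Local label => LoC resource type code, per the alignment review.
RESOURCE_TYPE_CODES = {
  "cartographic"               => "car",
  "data"                       => "dat",
  "mixed material"             => "mix",
  "moving image"               => "mov",
  "multimedia"                 => "mul",
  "notated movement"           => "nmv",
  "notated music"              => "not",
  "software"                   => "sof",
  "sound recording"            => "aud",
  "sound recording-musical"    => "aum",
  "sound recording-nonmusical" => "aun",
  "still image"                => "img",
  "text"                       => "txt",
  "three dimensional object"   => "art"
}.freeze

# Hypothetical helper: returns the LoC URI for a local label, or nil if the
# label is not in the controlled list.
def resource_type_uri(label)
  code = RESOURCE_TYPE_CODES[label.to_s.strip.downcase]
  code && "#{LOC_RESOURCE_TYPE_BASE}#{code}"
end
```

Labels outside the list return nil, so a caller can treat them as failing the controlled vocabulary.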

remerjohnson commented 8 years ago

In light of the above, is it safe to say we can specify the range of 'type' to be xsd:anyURI, then add the above URIs to the CVs tab? Don't mean to bring up dreaded Concept again, but just noting these are all instances of skos:Concept (and this is the range Chrissy/UCSB have)

ghost commented 8 years ago

What about notated music? If D'MPS is OK with removing it, we could go with the LoC URI and skos:Concept

remerjohnson commented 8 years ago

@GregReser Ah! Knew I missed one :smile:
It's http://id.loc.gov/vocabulary/resourceTypes/not.html

ghost commented 8 years ago

That will work and skos:Concept is a good range

mcritchlow commented 8 years ago

@lsitu and @ucsdlib/domm - could we close this ticket and create a new one for any remaining work? A lot got done here, but I don't want to lose any next steps. Please advise.

lsitu commented 8 years ago

@mcritchlow: Yes, I believe it's very close to being done already. Thank you all. @ucsdlib/domm, should we close this ticket as Matt suggested?

mcritchlow commented 8 years ago

Marking as closed per discussion in Review mtg.