Closed tpendragon closed 9 years ago
@sseymore @kestlund @wickr Going to need help on the metadata questions above.
@terrellt We have a method for photographer for both local vocab and LC. See our mappings_uo.yml.
I'll look into the other ones.
@terrellt
There is dct:isReferencedBy for refere if that works
For origin, we could create a formerID predicate or use:
vra:idFormerAccession rdf:type rdf:Property ; rdfs:label "former accession ID"@en ; rdfs:domain vra:Record ; rdfs:range rdfs:Literal ;
For compou, it looks like the value is in the wrong field. Can this field be renamed to album? We used dct:isPartOf for album in the Doris Ulmann photo collection. Or it can be merged with relate, because those are similar isPartOf values except for the "glass negatives" one. Can that be cleaned up and merged?
We have a method for photographer for both local vocab and LC. See our mappings_uo.yml.
I don't see the actual Ruby code for these, just a reference to the method.
There is dct:isReferencedBy for refere if that works
@wickr @mlv611 Is this good?
Could you both look into @sseymore's comments about origin too?
For compou, it looks like the value is in the wrong field. Can this field be renamed to album? We used dct:isPartOf for album in the Doris Ulmann photo collection. Or it can be merged with relate, because those are similar isPartOf values except for the "glass negatives" one. Can that be cleaned up and merged?
Removed the bad glass negatives thing in desc.all, I'll use dct:isPartOf for both fields.
@terrellt Linda is sending you the ruby code.
@terrellt Looks like <origin> is Original Photographic Number. Sometimes it's part of the full ID, sometimes it's not. So another identifier. I'm not sure it's entirely 'former' so I don't think 'formerID' would be good. I think it's more that a photo came in already having that identifier.
@terrellt @sseymore's notes about <origin> work for me.
<refere> I think is more accurately something like 'Appears In' as in a printed title, but dct:isReferencedBy is close enough for me.
<compou> was set to hidden from public view, as was <relate>.
@wickr Should I just not ingest the metadata in compou and relate?
@sseymore We have some "work types" that are more specific than what's in Getty - like "5x7 glass negatives" instead of "glass negatives". What do you guys usually do with those?
@terrellt I'm hesitant to say don't include something, but the compou and relate values can go with dct:isPartOf
@terrellt what field is that? We have cleaned up the data for things like that to make it conform to a CV.
@sseymore
Right now
<title>Portrait of three women -- Myrtle Gifford, sister, and mother?</title>
<digita>Gifford Photographic Collection</digita>
<creato>Gifford, Benjamin A.;</creato>
<date>circa 1885-1919</date>
<covera></covera>
<descri>Unidentified images that are likely of Gifford family members.</descri>
<subjec>Portrait photographs;</subjec>
<publis></publis>
<contri></contri>
<relati>Gifford Photographic Collection</relati>
<refere></refere>
<identi>P218 SG1 30 12</identi>
<origin></origin>
<type>Image</type>
<format>5x7 glass negatives;</format>
<source>Glass negatives;</source>
<other></other>
<rights>Permission to use must be obtained from OSU Special Collections and Archives Research Center.</rights>
<transm>Master scanned with Epson 10000XL scanner with Silver Fast 8.0.1 r18 (Dec. 5 2012) e75cb1f05.12 scanning software at 750ppi. No image manipulated.</transm>
<file>P_218_SG_1_30_12.tif</file>
<status>Cataloged</status>
<compou></compou>
<relate></relate>
<fullrs>Gifford1/P_218_SG_1_30_12.tif</fullrs>
<find>15.jp2</find>
<dmaccess></dmaccess>
<dmimage></dmimage>
<dmad1></dmad1>
<dmad2></dmad2>
<dmoclcno></dmoclcno>
<dmcreated>2013-03-12</dmcreated>
<dmmodified>2013-03-20</dmmodified>
<dmrecord>6</dmrecord>
becomes
<http://example.org/ns/6> <http://purl.org/dc/terms/title> "Portrait of three women -- Myrtle Gifford, sister, and mother?" .
<http://example.org/ns/6> <http://id.loc.gov/vocabulary/relators/pht> <http://id.loc.gov/authorities/names/n92004880> .
<http://example.org/ns/6> <http://purl.org/dc/terms/date> "circa 1885-1919" .
<http://example.org/ns/6> <http://purl.org/dc/terms/description> "Unidentified images that are likely of Gifford family members." .
<http://example.org/ns/6> <http://purl.org/dc/elements/1.1/subject> "Portrait photographs" .
<http://example.org/ns/6> <http://purl.org/dc/terms/identifier> "P218 SG1 30 12" .
<http://example.org/ns/6> <http://purl.org/dc/terms/type> <http://purl.org/dc/dcmitype/Image> .
<http://example.org/ns/6> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "5x7 glass negatives" .
<http://example.org/ns/6> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "Glass negatives" .
<http://example.org/ns/6> <http://purl.org/dc/terms/rights> <http://www.europeana.eu/rights/rr-r/> .
<http://example.org/ns/6> <http://opaquenamespace.org/rights/rightsHolder> "OSU Special Collections & Archives Research Center" .
<http://example.org/ns/6> <http://opaquenamespace.org/ns/conversionSpecifications> "Master scanned with Epson 10000XL scanner with Silver Fast 8.0.1 r18 (Dec. 5 2012) e75cb1f05.12 scanning software at 750ppi. No image manipulated." .
<http://example.org/ns/6> <http://www.loc.gov/standards/mods/modsrdf/v1/note> "Cataloged" .
<http://example.org/ns/6> <http://opaquenamespace.org/ns/full> "Gifford1/P_218_SG_1_30_12.tif" .
<http://example.org/ns/6> <http://www.loc.gov/premis/rdf/v1#hasOriginalName> "15.jp2" .
<http://example.org/ns/6> <http://purl.org/dc/terms/created> "2013-03-12" .
<http://example.org/ns/6> <http://purl.org/dc/terms/modified> "2013-03-20" .
<http://example.org/ns/6> <http://purl.org/dc/terms/replaces> <http://oregondigital.org/u?/gifford,6> .
<http://example.org/ns/6> <http://opaquenamespace.org/ns/set> <http://oregondigital.org/resource/oregondigital:gifford> .
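For anyone following the mapping above, here is a minimal Python sketch of the literal-valued part of the conversion. It is not the actual ingest code (that is Ruby and is not shown in this thread). The `<record>` wrapper element, the `FIELD_PREDICATES` table, the `MULTIVALUED` set, and the function names are all assumptions for illustration. The predicates are copied from the example output, plus dct:isPartOf for compou and relate as agreed earlier. URI-valued fields such as creato and rights need a vocabulary lookup and are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical subset of the mapping: CONTENTdm field -> predicate URI.
FIELD_PREDICATES = {
    "title": "http://purl.org/dc/terms/title",
    "date": "http://purl.org/dc/terms/date",
    "descri": "http://purl.org/dc/terms/description",
    "subjec": "http://purl.org/dc/elements/1.1/subject",
    "identi": "http://purl.org/dc/terms/identifier",
    "compou": "http://purl.org/dc/terms/isPartOf",
    "relate": "http://purl.org/dc/terms/isPartOf",
}

# Assumption: only these fields hold several values separated by ";".
MULTIVALUED = {"subjec", "compou", "relate"}


def escape_literal(value):
    """Escape backslashes, double quotes, and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(record_xml, base="http://example.org/ns/"):
    """Return one N-Triples line per non-empty mapped field value."""
    record = ET.fromstring(record_xml)
    record_id = (record.findtext("dmrecord") or "").strip()
    if not record_id:
        raise ValueError("record has no dmrecord value")
    subject = base + record_id
    lines = []
    for field in record:
        predicate = FIELD_PREDICATES.get(field.tag)
        text = (field.text or "").strip()
        if predicate is None or not text:
            continue  # unmapped or empty fields produce no triple
        values = text.split(";") if field.tag in MULTIVALUED else [text]
        for value in values:
            value = value.strip()
            if value:
                lines.append(f'<{subject}> <{predicate}> "{escape_literal(value)}" .')
    return lines


sample = """<record>
<title>Portrait of three women -- Myrtle Gifford, sister, and mother?</title>
<subjec>Portrait photographs;</subjec>
<origin></origin>
<dmrecord>6</dmrecord>
</record>"""

for line in record_to_ntriples(sample):
    print(line)
```

Empty fields such as `<origin></origin>` are skipped, which matches the example output, where empty CONTENTdm fields produce no statements.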
@terrellt
Apologies for the delay. We have used hasFormat in our collections for fields that list the source format, for instance "black and white negative". So I think that's a good fit for Dimension Format (format).
For the source and other fields, which are filled with other CVs, could they be made into subject fields? It's a lot of random terms and I'm not sure what you guys can do with them.
We have also used hasVersion for other digital file formats, but I'm not sure if this works for you since the values are mixed.
@sseymore source/other seem pretty clearly to be something that should come from Getty, and thus RDF.type, no? I'll use hasFormat for <format>.
@terrellt can you send me the desc.all file please?
@terrellt found the desc.all file. I see the values now. Most of them will be in AAT, so Julia suggests we use vra:workType.
@sseymore Every other collection we've imported has used DC.type - was that wrong?
@terrellt Hmm, well, for dct:type it should be the DCMI Type Vocabulary: http://dublincore.org/documents/2000/07/11/dcmi-type-vocabulary/
From DC: "To describe the file format, physical medium, or dimensions of the resource, use the Format element." These 3 fields describe the source format, so I would go with format elements.
@kestlund @jsimic please correct me if I'm wrong.
@sseymore http://dublincore.org/documents/dcmi-terms/#terms-type says to use a controlled vocab (any one, not a specific one), and the range seems to be "any RDF thing".
@terrellt, @sseymore, @jsimic: we had specified that dcmi-type be the vocab used with dc.type, and to use a more appropriate type category if available, like vra:workType.
See metadata dictionary for OD: https://docs.google.com/document/d/1pudn5bDMikQ0xlNv6cEnscDkakCyeRjPwVkFNNVp0JM/edit#heading=h.4xjto5aqul0
We typically split things like 5X7 glass plate negative and put the 5X7 in a dimension field or description.
I would like to keep dc.type with dcmi type if possible, but let me know if not.
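To illustrate the split described above (dimension into its own field, the rest as the work type), here is a rough Python sketch. It is only an illustration, not what the ingest actually runs. The function name, the regular expression, and the choice to capitalize the remaining label are assumptions.

```python
import re

# Assumption: dimensions appear as a leading "<number>x<number>" token, e.g. "5x7" or "5X7".
DIMENSION_PREFIX = re.compile(
    r"^\s*(\d+(?:\.\d+)?\s*[xX]\s*\d+(?:\.\d+)?)\s+(.+?)\s*$"
)


def split_dimension(value):
    """Split '5x7 glass negatives' into ('5x7', 'Glass negatives').

    Values without a leading dimension come back as (None, value).
    """
    match = DIMENSION_PREFIX.match(value)
    if match is None:
        return None, value.strip()
    dimension = re.sub(r"\s*[xX]\s*", "x", match.group(1))
    work_type = match.group(2)
    return dimension, work_type[0].upper() + work_type[1:]


print(split_dimension("5x7 glass negatives"))
print(split_dimension("Glass negatives"))
```

The dimension part could then go into a dimension or description field, and the remaining label could be matched against the controlled vocabulary.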
@kestlund Data dictionary says RDF.type for work type, which is what we've been using. DC.type for DCMI type, so I'll follow that.
Great. Yes, do that. Sorry for the confusion.
Just need to get this reviewed.
Sent to Larry to get this spot checked.
Is 'printouts' supposed to be showing up as a Work Type? Just curious.
thanks @jsimic
@jsimic @terrellt I think the second work type (printouts) should be removed.
@mlv611 @tvc15brian : Completely up to you guys.
@wickr @terrellt @mickeroo we're about to start a metadata cleanup process with this collection. We have about 35 items waiting for the local workType term of "Glass positives", which I added to worktype.jsonld a couple months ago. How long does it take for new Opaque Namespace Vocabulary Terms to make it into Oregon Digital?
Until I run the code to re-fetch things from github. I'll start it.
@tvc15brian It's in there and autocompleting now.
Thanks! @terrellt
Update from SCARC meeting 6/18/15:
Before Reviewing this collection:
That will be enough to Review, and then we can continue Topic and Work Type cleanup.
I will likely fix the work type URIs sooner rather than later, because you can't edit an item until they are corrected (Term not in Controlled Vocabularies error).
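As a rough illustration of finding the values that trigger that error before anyone tries to edit an item, here is a small Python sketch. It assumes the allowed work type URIs have already been loaded into a set. The function name and the example.org URIs are placeholders, not real Opaque Namespace terms.

```python
def find_uncontrolled_terms(values, vocabulary):
    """Return the distinct values that are not in the controlled vocabulary, sorted."""
    return sorted({value for value in values if value not in vocabulary})


# Placeholder URIs for illustration only.
allowed = {
    "http://example.org/workType/GlassNegatives",
    "http://example.org/workType/GlassPositives",
}
item_work_types = [
    "http://example.org/workType/GlassNegatives",
    "Glass positives",  # still a text string, not a URI in the vocabulary
    "http://example.org/workType/Printouts",
]

print(find_uncontrolled_terms(item_work_types, allowed))
```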
WorkTypes are mostly cleaned up:
Unsure about:
Photographers URIs cleanup should be finished. Any that aren't listed or faceting at this point I don't believe were in CONTENTdm to begin with, but let me know if I missed anything. Summary:
I also started on the Location cleanup, mapping text to Geonames URIs. There are 274 fields that are not URIs.
@wickr I tried fixing some of the fields today before I left. I will finish fixing the rest on Monday.
@Clarkeri which fields do you mean? Location? I was going to bulk change most of those.
@wickr I changed a few of the location fields. I'm glad you can just do a bulk change. Thanks!
Thanks Erin, apparently you changed almost all of them. I fixed the few remaining ones. Then I ran a script to fix older Geonames URIs that didn't have a slash, and then I ran another script to basically reindex the Geonames labels, so all of the Location/Region facets should be correct and clean.
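The trailing-slash fix can be sketched roughly as below. This is not the script that was actually run, just an illustration in Python. It assumes Geonames URIs have the form `http://sws.geonames.org/<id>/`; the function name is made up and the ID in the example is a placeholder.

```python
import re

# Matches a Geonames URI with or without the trailing slash.
GEONAMES_URI = re.compile(r"^(https?)://sws\.geonames\.org/(\d+)/?$")


def normalize_geonames_uri(value):
    """Add the trailing slash to a Geonames URI; leave other values unchanged."""
    match = GEONAMES_URI.match(value.strip())
    if match is None:
        return value  # plain text or some other URI: not touched here
    scheme, geoname_id = match.groups()
    return f"{scheme}://sws.geonames.org/{geoname_id}/"


# 1234567 is a placeholder ID, not a real place.
print(normalize_geonames_uri("http://sws.geonames.org/1234567"))
```

Values that are still plain text pass through unchanged, so the text-to-URI mapping remains a separate step.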
I also fixed some of the missing images today but there's still lots more.
I fixed about 50 missing images, and I'm pretty sure I got them all.
I want to do some quick Subject cleanup but after that this should be good enough to Review.
Bulk changes for Subjects are done. There were 1500+ text strings, with lots of repeats. Most were in TGM, a handful were in LCSH/LCNAF. There are about 100 left.
Down to 10 unique text strings for Subject:
Extension (2x)
Face Rock
Gifford, house of (20x)
Kueny, Mary (5x)
Nygren, Gene
Poling Hall
Reiling, Norman (4x)
Runnion, Kenneth
Steiner, John
Weatherford Hall
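A tally like the one above can be produced with something along these lines. This is a sketch in Python, not the script that was used. It assumes the Subject values are available as a flat list of strings and treats anything that does not start with http:// or https:// as a leftover text string. The function name and the example.org URI are placeholders.

```python
from collections import Counter


def remaining_text_values(values):
    """Count Subject values that are still text strings rather than URIs."""
    counts = Counter()
    for value in values:
        value = value.strip()
        if value and not value.lower().startswith(("http://", "https://")):
            counts[value] += 1
    return counts


subjects = [
    "http://example.org/subject/1",  # placeholder for an already-mapped URI
    "Face Rock",
    "Kueny, Mary",
    "Kueny, Mary",
]

for label, count in remaining_text_values(subjects).most_common():
    print(f"{label} ({count}x)")
```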
Everything is reviewed and live. The remaining subject cleanup was moved to new content repo: osulp/oregondigital-content#1