american-art / aaa

Archives of American Art
Creative Commons Zero v1.0 Universal

map GeneralFormat and SpecificFormat #10

Closed VladimirAlexiev closed 6 years ago

VladimirAlexiev commented 7 years ago

GeneralFormat and SpecificFormat are very useful classifications, please emit them. There are 11 GeneralFormat values and 135 SpecificFormat values. Since these are not mapped to AAT, emit them as a local 2-level thesaurus.

Take, e.g., http://data.americanartcollaborative.org/page/aaa/object/1639, which should have type "Artworks" and subtype "cartoon (humorous image)". @workergnome @azaroth42, do you agree with this mapping:

<aaa/object/nnn>
  crm:P2_has_type
    <aaa/thesaurus/object/artworks>,
    <aaa/thesaurus/object/cartoon_humorous_image>.

<aaa/thesaurus/object/artworks> a skos:Concept;
  skos:inScheme <aaa/thesaurus/object/>;
  skos:prefLabel "Artworks".

<aaa/thesaurus/object/cartoon_humorous_image> a skos:Concept;
  skos:inScheme <aaa/thesaurus/object/>;
  skos:broader <aaa/thesaurus/object/artworks>;
  skos:prefLabel "cartoon (humorous image)".
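
A minimal sketch of how the local concepts above could be generated from the labels. The `slugify` and `concept_turtle` helpers are hypothetical names, and the slug rule (lower-case, non-alphanumerics collapsed to `_`) is only inferred from the two example URLs; it also assumes labels contain no double quotes.

```python
import re

def slugify(label: str) -> str:
    """Lower-case the label and collapse runs of non-alphanumerics to '_'."""
    return re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")

def concept_turtle(label, broader=None, base="aaa/thesaurus/object/"):
    """Return a Turtle snippet for one skos:Concept in the local scheme."""
    lines = [
        f"<{base}{slugify(label)}> a skos:Concept;",
        f"  skos:inScheme <{base}>;",
    ]
    if broader is not None:
        # Specific formats point at their general format via skos:broader.
        lines.append(f"  skos:broader <{base}{slugify(broader)}>;")
    # Assumption: the label needs no escaping (no quotes or backslashes).
    lines.append(f'  skos:prefLabel "{label}".')
    return "\n".join(lines)

print(concept_turtle("Artworks"))
print()
print(concept_turtle("cartoon (humorous image)", broader="Artworks"))
```

Run over all 11 general and 135 specific values, this would produce the whole 2-level thesaurus in the same shape as the hand-written example.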

tobiashreiter commented 7 years ago

@VladimirAlexiev, these have been mapped to AAT terms. We have extra spreadsheets in the data-v3 folder, one for general formats and one for specific formats, that link our local terms to the AAT code and term.

VladimirAlexiev commented 7 years ago

Weirdness: I see them in https://github.com/american-art/aaa/tree/master/data-v3. But when I Pull or Fetch (or even Revert), I don't see them in my working directory.

Same for Item_DigitalResources.xls.

VladimirAlexiev commented 7 years ago

Of course, it's better to emit AAT concepts. I propose this changed mapping:

<aaa/object/nnn> crm:P2_has_type aat:300133025, aat:300123430.

aat:300133025 a skos:Concept;
  skos:prefLabel "Artworks".

aat:300123430 a skos:Concept;
  skos:broader aat:300133025;
  skos:prefLabel "cartoon (humorous image)".

It would still be best to emit skos:broader according to AAA's 2-level classification. I've extracted it from objects (Item.xls) like this:

> csvcut -t -e cp1251 -c GeneralFormat,SpecificFormat Item.txt | sort | uniq -c > Item-Format.txt

Item-Format.txt

@tobiashreiter, can you add such a "parent" column to AAA Specific Format Mapping To AAT.xlsx?
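
For anyone without csvkit, the same counting can be sketched in Python, along with one possible way to derive the "parent" column from it. The function names are hypothetical, the sample rows are illustrative only (just the Artworks/cartoon pair comes from this thread), and picking the most frequent general format per specific format is an assumption about how to handle any inconsistent rows.

```python
import csv
import io
from collections import Counter

def count_format_pairs(lines, general="GeneralFormat", specific="SpecificFormat"):
    """Count (GeneralFormat, SpecificFormat) pairs in tab-delimited input."""
    reader = csv.DictReader(lines, delimiter="\t")
    return Counter((row[general], row[specific]) for row in reader)

def parent_column(counts):
    """Map each specific format to its most frequent general format."""
    best = {}
    for (general, specific), n in counts.items():
        if specific not in best or n > best[specific][1]:
            best[specific] = (general, n)
    return {specific: general for specific, (general, _) in best.items()}

# Illustrative sample; the real input would be opened with something like
# open("Item.txt", encoding="cp1251", newline="")
SAMPLE = (
    "GeneralFormat\tSpecificFormat\n"
    "Artworks\tcartoon (humorous image)\n"
    "Artworks\tcartoon (humorous image)\n"
    "ExampleGeneral\texample specific\n"
)

counts = count_format_pairs(io.StringIO(SAMPLE))
for (general, specific), n in sorted(counts.items()):
    print(n, general, specific, sep="\t")
print(parent_column(counts))
```

Unlike the shell pipeline, this skips the header row instead of counting it as a pair.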

tobiashreiter commented 7 years ago

@VladimirAlexiev , just to be very clear, are you simply asking that the general format be added to the specific format spreadsheet in the first column? If so, would you prefer our internal (AAA) vocabulary for general format, or the AAT term?

tobiashreiter commented 7 years ago

@VladimirAlexiev : I've added the general format (using our internal vocabulary) to the Specific Format Mapping sheet.

VladimirAlexiev commented 7 years ago

AAT is preferred for the extra column, but ISI can do that mapping.

tobiashreiter commented 7 years ago

Great. And, as a reminder, that mapping is available in the General Formats spreadsheet.

tobiashreiter commented 6 years ago

Mapped, with local terms replaced with AAT terms (where available). Using AAT "document genre" for General Format, and "format" for Specific Format.
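
A rough sketch of the "AAT where available" rule, not the actual mapping code. The `type_uri` helper and the local fallback URL pattern are assumptions (the fallback reuses the local thesaurus base proposed earlier in this thread), and the lookup table holds only the two AAT IDs quoted above; the real table would come from the mapping spreadsheets.

```python
import re

AAT = "http://vocab.getty.edu/aat/"   # namespace behind the aat: prefix
LOCAL = "aaa/thesaurus/object/"       # assumed fallback base for unmapped terms

# Local label -> AAT ID. Only the two IDs mentioned in this thread.
AAT_IDS = {
    "Artworks": "300133025",
    "cartoon (humorous image)": "300123430",
}

def slugify(label: str) -> str:
    """Lower-case the label and collapse runs of non-alphanumerics to '_'."""
    return re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")

def type_uri(label: str, aat_ids=AAT_IDS) -> str:
    """Return the AAT URI when the term is mapped, else a local concept URI."""
    aat_id = aat_ids.get(label)
    if aat_id is not None:
        return AAT + aat_id
    return LOCAL + slugify(label)

print(type_uri("Artworks"))
print(type_uri("cartoon (humorous image)"))
print(type_uri("some unmapped term"))
```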