Closed by steads 8 years ago
The linguistic mapping looks expensive. To take the first object... "charcoal and chalk on paper" has three separate concepts. However, creating a new type for the combination seems overly specific. What about just recording the value as a string, acknowledging that it could be broken apart in the future, without minting new URIs for all of the different free-text values?
If even E55 is too expensive for you, then use E62 String and lose all your semantics.
I provided indexed data for both medium and support in the ObjThesTerms table of my updated data set.
@si-npg: is the Medium field fully covered by ObjThesTerms, and should it therefore be skipped?
I'm assuming that the answer to Vladimir's question is yes. So I mapped the medium field in NPGObjThesTerms, and I'm going to ignore it in the NPGObject model. Let me know if this is an incorrect assumption!
@rhao can you give an example in turtle so we can check it?
See 87179, 87284, 85694 and 87054 as examples where ObjThesTerms differ from or are not as descriptive as the Medium field. And a few don't even have any ThesTerms, e.g. 38463. Sometimes a list of mediums and a support doesn't tell the whole story. We have additional ThesTermTypes for "Process" and "Other Classification", but without a human-written display field, the medium/support might sometimes be unclear.
Thanks for the examples showing why the Medium field is still necessary. I changed the medium class from E57_Material to E55_Type to make it more general.
I think P2 is wrong. If you have a drawing made with "charcoal and chalk on paper", the former is the type and the latter is a material/medium/technique.
If you encounter a field "Shape" (e.g. CONA has one), would you also map it to P2?
So ... it turns out (after implementation) that the linguistic mapping for some object types is pretty easy. In particular Paintings and Drawings can be reasonably accurately mapped with regular expressions by separating the materials from supports.
For the example "charcoal and chalk on paper", the E22 has two E57s: Charcoal and Chalk.
It consists_of an E18 with P2 of Support, that has an E57 of Paper. This also gives us the E18 to associate dimensions with (consider framed vs unframed).
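To make the regular-expression idea concrete, here is a minimal sketch of how a "materials on support" string could be split. It is an illustration only, not the implementation used for this mapping: the pattern, the separators (comma, "and", "&") and the function name `parse_medium` are all assumptions.

```python
import re

# Hypothetical pattern: "<materials> on <support>", e.g. "charcoal and chalk on paper".
MEDIUM_RE = re.compile(r"^(?P<materials>.+?)\s+on\s+(?P<support>.+)$", re.IGNORECASE)

# Assumed separators between individual materials: comma, "and", "&".
SPLIT_RE = re.compile(r"\s*(?:,|\band\b|&)\s*", re.IGNORECASE)


def parse_medium(text):
    """Split a medium string into materials and a support.

    Returns None when the string does not follow the
    "<materials> on <support>" shape, so the caller can fall back
    to keeping the value as free text.
    """
    match = MEDIUM_RE.match(text.strip())
    if match is None:
        return None
    materials = [
        part.strip()
        for part in SPLIT_RE.split(match.group("materials"))
        if part.strip()  # drop empty pieces produced by ", and"
    ]
    return {"materials": materials, "support": match.group("support").strip()}


if __name__ == "__main__":
    print(parse_medium("charcoal and chalk on paper"))
    print(parse_medium("Glazed earthenware with blue decoration"))
```

Strings that do not match (typical for Decorative Arts) come back as `None`, which keeps the display value available as a fallback rather than forcing a wrong split.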
For Decorative Arts, on the other hand, it is almost impossible to work out what is happening in the "medium" field ... it combines parts, shapes, colors, materials and general descriptive text. Keyword analysis and natural language processing could potentially separate these into distinct materials, but establishing the relationship between the materials and the parts would require sophistication beyond the state of the art.
Medium: Fields like this are a perpetual problem. They are a mishmash of many different things lumped together into one field: a mixture of technique (mostly general, but with some specifics) and material. Dealing with it completely would require a full analysis of each unique entry string, so that it can be divided into those elements that are E29 Design or Procedure (for specific documented techniques), E55 Type (for general techniques) and E57 Material. The relationship also needs to be established so that the correct property is used to connect things. Materials that become part of the object use P45 consists of, whereas materials that are only used during modification/production use P126 employed (was employed in). If the resources are available to do this analysis, then great; if not, then treat it as a simple E55 Type: E22 -> P2 -> E55 (Medium).
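The choice of property described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `map_medium`, the identifiers, and the plain-string property labels are hypothetical, and E29 and general-technique E55 values are left out for brevity.

```python
def map_medium(obj, medium_text, analysed=None, production=None):
    """Return (subject, property, object) tuples for one Medium value.

    `analysed` is an optional list of (material, incorporated) pairs
    produced by a per-string analysis; `production` is the identifier
    of the production event. All identifiers here are hypothetical.
    """
    if not analysed:
        # No analysis available: E22 -> P2 -> E55 (Medium).
        return [(obj, "P2_has_type", medium_text)]

    triples = []
    for material, incorporated in analysed:
        if incorporated:
            # Material became part of the object.
            triples.append((obj, "P45_consists_of", material))
        else:
            # Material was only used during modification/production.
            triples.append((production, "P126_employed", material))
    return triples


if __name__ == "__main__":
    # Fallback case: keep the whole string as a single type.
    print(map_medium("object/1", "charcoal and chalk on paper"))
    # Analysed case: both materials end up in the object.
    print(map_medium(
        "object/1",
        "charcoal and chalk on paper",
        analysed=[("charcoal", True), ("chalk", True)],
        production="object/1/production",
    ))
```

Whether a given material counts as incorporated is exactly the judgement that needs the per-string analysis, so the sketch takes it as input rather than trying to infer it.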