national-gallery / NG-CIIM

Development of Gallery-instigated CIIM configurations and plugins; not the Gallery's CIIM itself.

Objects - general feedback #24

Open RGShepherd opened 11 months ago

RGShepherd commented 11 months ago
  1. identified_by: in the accession no. block, please could we put the value of the Elasticsearch 'display number' - still called 'accession number' in Linked Art?
  2. For our texts, please could we distinguish between brief texts, https://vocab.getty.edu/aat/300435416, and full texts, http://vocab.getty.edu/aat/300418050? (Though the scope note for the latter is not helpful. If this seems the wrong label, then can we add an additional, purely textual one? Though I think AAT may be missing something along the lines of 'online description' as a form of text.)
  3. Copyright/License Statement: for various reasons, not necessarily good - please can we just drop this block?
  4. member_of: empty in object-1540 - how does this differ from part_of, which is populated?
  5. And what's the distinction between carries (empty) and shows (populated)?

Sorry - writing this on the train, so not always able to double-check the Linked Art spec.

richardofsussex commented 10 months ago
  1. you just want it called 'display number'? It's already in there
  2. I think both of your texts - short and long - are https://vocab.getty.edu/aat/300435416. I don't think http://vocab.getty.edu/aat/300418050 (or its cousin http://vocab.getty.edu/aat/300418049) is particularly expressive of the distinction you want to make, so I would prefer to just put in a textual sub-classification
  3. dropped
  4. member_of is for objects in a package (i.e. members of a set). Now tested for more specifically, and so shouldn't appear within an object record
  5. carries is for inscriptions and other textual things which appear on the object. Shows is for visual depiction. Which object has a 'carries' statement?
RGShepherd commented 10 months ago

@richardofsussex , thanks!

  1. Not quite. We need to test for the presence of a display number, and if it is there we ignore the value of the accession number. Instead, we use the display number, but we treat it as if it were an accession number, i.e. it is classified_as an accession number. We suppress the display number block entirely. In short: If there's a display number, its value replaces that of the accession number.
  2. OK.
  3. Resolved - thanks!
  4. No, sorry, if that's what it's used for - we do need member_of in the object record - but happy for it to be suppressed if empty.
  5. Any of object-1686, object-1755, object-5201, object-18535, object-18788 - look in the inscription block in the ES.
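
A minimal sketch of the display-number rule in point 1 above, in Python rather than the XSLT actually used for the transformation. The function name, argument names and the shape of the returned block are illustrative assumptions; the classification passed in as `accession_type` stands for whatever "accession number" type the output already uses.

```python
from typing import Optional


def accession_identifier(object_number: Optional[str],
                         display_number: Optional[str],
                         accession_type: dict) -> Optional[dict]:
    """Build the single accession-number Identifier block (hypothetical helper).

    If a display number is present, its value replaces the object number,
    and no separate display-number block is emitted.
    """
    # Treat None and whitespace-only strings as "not present".
    display = (display_number or "").strip()
    fallback = (object_number or "").strip()
    value = display or fallback
    if not value:
        return None  # nothing to emit, so the block is suppressed
    return {
        "type": "Identifier",
        "classified_as": [accession_type],  # still classified as an accession number
        "content": value,
    }
```

The display number never gets a block of its own here: it only ever supplies the `content` of the accession-number block.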
RGShepherd commented 10 months ago

That was object-1540; moving on to object-1550.

  1. Our ongoing _label issue: please remove the classification component.
  2. classified_as for school: equally, one could argue that this belongs under shows, as per https://linked.art/model/object/aboutness/#style-classification, because it's primarily a stylistic statement.
  3. classified_as for department: strictly speaking, departments should be listed under custody (https://linked.art/model/object/ownership/#custody) but that's not quite how we use it here. Perhaps we could simply record this as http://vocab.getty.edu/aat/300438433, 'status notes'?
  4. classified_as for genre: probably belongs in shows, as per https://linked.art/model/object/aboutness/#other-classifications?
  5. other types of classified_as - physical form, function, legal status: I can't find places for these elsewhere in the model, so I think fine as they are, unless anyone else has suggestions?
  6. Provenance statements: I'd suppress this block unless there's content in provenance.text.value in the ES
  7. shows for subject matter: tricky, because we don't make the Linked Art distinction between depiction and subject; on balance, I'd probably treat everything as a subject (i.e. shows.about, rather than shows.classified_as) for now, because actual depictions will in due course be managed by links to people and places.
RGShepherd commented 10 months ago

And object-1686 ...

  1. There's a difference between bibliography.@link.details.type, bibliography.@link.details.page, and bibliography.@link.details.note in the ES, so each probably needs its own referred_to_by block. page relates to http://vocab.getty.edu/aat/300445022, note is still http://vocab.getty.edu/aat/300435415, and type is probably closest to http://vocab.getty.edu/aat/300379665.
RGShepherd commented 10 months ago

object-1755:

  1. I've asked for access.item.lending to be removed from the ES; but for safety's sake, @richardofsussex, please could you remove it in the XSLT as well?
RGShepherd commented 10 months ago
  1. We need to add a little more data into produced_by.carried_out_by, I think, as what we have in the data is a series of previous attributions, and this is not reflected in the JSON. The key datum is creation.maker.@link.role.value; possible values are Artist and Previous Attribution. Here, I think we're in the realm of assertions: https://linked.art/model/assertion/#assignment-of-attributes. (object-3520)
  2. Which leads me to qualified attributions, as contained in creation.maker.@link.prefix and creation.maker.@link.suffix; this looks like a case of https://linked.art/model/assertion/#style-of-attribution. (object-1550; but note that here it relates to 'workshop of', which Linked Art, being pedantic, counts as an entity in its own right; really, does anyone record their data with this level of semantic precision?)
richardofsussex commented 10 months ago
  1. When you talk about "accession number" in your ES data, do you mean "object number"?
jpadfield commented 10 months ago

Which is the best object to work from?

  1. Also, I was wondering where the pattern classified_as - Classification came from - it seems to just overcomplicate things? Is it the standard method of linking "other" metadata?
jpadfield commented 10 months ago

In Object 1540

  1. the unit is missing from the weight dimension
jpadfield commented 10 months ago

In Object 1540

*the unit is missing from the weight dimension

Also:

  1. The made_of statement does not quite work:

"made_of": [
    {
        "type": "Material",
        "content": "Oil on wood"
    }
],

The painting is "made_of" Oil and Wood, not "Oil on Wood" - LA has this as a 'Material Statement' instead.

  1. We have an empty "part" term.
jpadfield commented 10 months ago
  1. It could be good to have an equivalent PID for the National Gallery - such as - https://ror.org/043kfff89
  2. What data will we actually have in relation to the NG PID for the National Gallery?
  3. Current location - It will also be interesting to see what details we have for individual rooms such as "Gallery 25" - As the data is going to be used externally, would it be good to have more than one label for NG rooms? I was thinking the simple names we have and then more complete names such as "Gallery 25 of The National Gallery", probably not needed ... it will depend on how future systems might display labels relating to locations within multiple Museums ....
jpadfield commented 10 months ago
  1. Looking at the output again it would seem that several of the classified_as values (school & genre) might work better as "shows" values - just to be consistent ...
  2. The "Picture" classification could probably be a simple link to the aat PID
  3. The "department" thing - might that make more sense as the painting being a member of a group? It might make it clearer for external users ... objects may well belong to a lot of groups going forward - "Objects on display", "Lined paintings", "X-rayed Paintings", etc .... I do not have a specific use case here, just thinking that as we are starting to classify paintings for example in relation to environmental conditions groups might be a nice way to go, as we can then have nice scope notes for the groups .....
  4. Not sure about the "physical characteristics" bit they could be as is or simple links to AAT again
  5. The "Accessioned object" bit - how do you see this being used? It could again be done with group or just via the current owner - but we own accessioned and non-accessioned works, so I'm not sure - would be good to see how it would be used.
RGShepherd commented 10 months ago
  1. When you talk about "accession number" in your ES data, do you mean "object number"?

I do - sorry!

RGShepherd commented 10 months ago
  1. Yes, comparing our classified_as block with https://linked.art/example/object/21, I think we nest one level too far?
  2. Agreed - the data is there in the ES, in measurements.dimensions.units; I suspect it needs additional coding to map to the relevant AAT term?
  3. Agreed. In fact, we will be delivering both; @richardofsussex, please could you code so that material.value goes to a materials statement, but where we have links (i.e. we have material.@admin.uid, the data goes into made_of? E.g. object-4668.
  4. Agreed. @richardofsussex , please could you drop empty blocks? I think it's tidier to do so, and will help people reusing the data.
  5. Alternative identifiers belong on the organisation record for the NG, agent-652. We map to ULAN and Wikidata by default, and those will be in the CIIM data on the next extraction.
  6. The standard organisation record derived from the CIIM - see agent-652.

Have to move onto something else now, but @richardofsussex , I hope this gives you something to be getting on with.
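
One way the routing requested in point 3 above could look, sketched in Python rather than the project's XSLT. The input shape (a list of material entries with `value` and an optional `@admin.uid`) is inferred from the field names in this thread, and `uri_for_uid` is a hypothetical callback, since the thread does not say how a uid becomes a URL. The two AAT identifiers are the ones that appear in the object-4668 output quoted later in this issue.

```python
# Classification reused for every free-text material statement.
MATERIAL_STATEMENT = {
    "id": "http://vocab.getty.edu/aat/300435429",
    "type": "Type",
    "_label": "Material Statement",
    "classified_as": [{
        "id": "http://vocab.getty.edu/aat/300418049",
        "type": "Type",
        "_label": "Brief Text",
    }],
}


def split_materials(materials, uri_for_uid):
    """Route linked materials to made_of and free text to statements."""
    made_of, statements = [], []
    for entry in materials:
        value = (entry.get("value") or "").strip()
        uid = (entry.get("@admin") or {}).get("uid")
        if uid:
            # Linked term: becomes a reference in made_of.
            ref = {"id": uri_for_uid(uid), "type": "Material"}
            if value:
                ref["_label"] = value
            made_of.append(ref)
        elif value:
            # Free text only: becomes a Material Statement.
            statements.append({
                "type": "LinguisticObject",
                "classified_as": [MATERIAL_STATEMENT],
                "content": value,
            })
        # Entries with neither a uid nor text are dropped entirely.
    return made_of, statements
```

Skipping entries that have neither a uid nor text also covers the request in point 4 to drop empty blocks, at least for materials.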

richardofsussex commented 10 months ago
  1. The Linked Art guidelines say

    The style is associated with the object using the classified_as property, and must be a reference to an appropriate vocabulary.

What we have are strings like "Italian (Florentine)". Shall I bung them in anyway as the 'outer' classified_as within "shows", and sub-classify them as School in the approved LA manner? Or put them as a simple _label, with a one-level classification as "School"?

richardofsussex commented 10 months ago

Looking back over my conversation with the Linked Art crowd, I found this (from Rob S.) from 6 April:

I agree ... classified_as is the right approach. If you can reconcile with AAT, then that would be great for cross-institution interoperability. If you can't, then minting your own dereferencable ID is great. If you can't, then I agree with David that just putting in a URI that doesn't return anything but with a _label is a fine stopgap until one of the previous can be done.

This suggests the first of my two strategies above, with "Italian (Florentine)" going in as a label. Rob recommends adding a URL which doesn't resolve to anything, so we could do e.g. https://data.ng.ac.uk/school/Italian(Florentine). What do we think? If you look at the last classified_as entry in this Getty Museum record: https://data.getty.edu/museum/collection/object/c88b3df0-de91-4f5b-a9ef-7b2b9a6d8abb you'll see this approach being used.
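
As a rough illustration of that first strategy, the sketch below builds a non-resolving placeholder URI from a free-text label and wraps it in a classified_as entry. It is Python rather than the XSLT in use, the function names are made up for the example, and joining words with underscores is an assumption about how the label would be turned into a URL path. The sub-classification for "School" is passed in rather than hard-coded, because no identifier for it is given here.

```python
def placeholder_uri(kind: str, label: str,
                    base: str = "https://data.ng.ac.uk") -> str:
    """Mint a non-resolving URI from a free-text label (hypothetical scheme)."""
    # Collapse runs of whitespace and join the words with underscores.
    slug = "_".join(label.split())
    return f"{base}/{kind}/{slug}"


def school_classification(label: str, school_type: dict) -> dict:
    """Outer classified_as entry for 'shows', sub-classified as School."""
    return {
        "id": placeholder_uri("school", label),
        "type": "Type",
        "_label": label,
        "classified_as": [school_type],
    }


entry = school_classification(
    "Italian (Florentine)",
    {"type": "Type", "_label": "School"},
)
# entry["id"] == "https://data.ng.ac.uk/school/Italian_(Florentine)"
```

Because the URI is derived purely from the label, two records with the same free-text school get the same placeholder, which keeps the output consistent until real identifiers exist.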

(This is a general issue, which has cropped up before and will crop up again. It would be good to get agreement on how we tackle it. And apologies if you think I'm going over ground we have already covered!)

richardofsussex commented 10 months ago

Which gives us this: image See http://richardofsussex.me.uk/ng/ciim7-output/object-1550.json for a complete 'shows' section which I think implements this approach. (Note that the 'subject' block now has simple one-level URLs, since the context of 'shows' doesn't require further qualification.)

RGShepherd commented 10 months ago

@richardofsussex , your work on 7 looks good to me - many thanks!

RGShepherd commented 10 months ago
  1. Current location: this is managed using the distinction between custodian (NG) and location (individual rooms) in Linked Art, and I see no reason to create data to elide it. Happy with data as it is.
  2. This is 7.
  3. We have other classifications, too, and no way of storing an external PID for them unless we hard-code the mapping in these transformation scripts, which has a maintenance overhead.
  4. This is 8. Happy with my proposal there for now.
  5. Physical form: these will be links to AAT when the Gallery gives me the resources to sort out our thesaurus / classifications. Otherwise, they're fine as they are.
RGShepherd commented 10 months ago
  1. There's already a distinction between ownership and custodianship in the model, so the classification by legal status seems fine for now. N.b. legal statuses overlap with Departments - so we have accessioned objects in both the main collection (subject to the Act) and history/contextual collection (not subject to the Act).
jpadfield commented 10 months ago

Is there a separate issue or document somewhere that discusses the logic of forming our vocabulary term URLs, such as https://data.ng.ac.uk/school/Italian_(Florentine)?

Would it not be better to follow the same structure as references to the AAT- PID based URL, type and label?

RGShepherd commented 10 months ago

This data originates in TMS, where it is stored as free text against each object with no scope for mapping to external identifiers. It's just one of a long list of things we could improve, but wasn't seen as a priority for DDP, and I don't have the resource to change what we do, retrain TMS users accordingly, update all our guidance, update all the database views and reports that use the data, and remap our CIIM data. So for now, for this and many other things, we will work with the data we have. As far as I'm concerned, given that these URLs are effectively meaningless, we can just generate them in the manner Richard's proposed. There's a question for the Linked Art community about the desirability of mandating the creation of decidedly un-cool URIs, but for now, that's what they recommend, so we'll follow their advice.

jpadfield commented 10 months ago

Sorry, fully understand and was not suggesting we make changes to TMS, and given our limits I am OK with the recommendation of using a placeholder un-cool URL. I was just thinking about the structure and potential future use of the placeholder un-cool URL.

The CIIM does have an API to generate PIDs, so we could generate PIDs for the various terms, which would then give us an alternative placeholder un-cool URL - it would not resolve correctly at the moment, but it would mean that when we do have a vocab server in place and correctly hooked in to the various systems, our temp URLs would start to resolve and we would not need to change our published data. But if that is not an option now then we can't; it was just a thought.

However, thinking of the current dummy URL, might it be better to use something like: https://data.ng.ac.uk/term/Italian_(Florentine) - this could be set up to resolve or redirect to a https://data.ng.ac.uk/NGPID type URL in the future if needed, and thus would also start to resolve without us changing data ....

Sorry, I started the thought as I was wondering why we wanted to put the type and label into the URL when the type was already covered by the nested AAT reference, but if you are happy and you do not agree with the other two options then we can live with the dummy URL as suggested and then swap them out later when we can

richardofsussex commented 10 months ago

I've dealt with weight, materials statements (two aspects) and empty part arrays, and re-generated all the test records (at the richardofsussex address). I think that's everything that has an action against it: please check and let me know of any others. Correction: I've just spotted 15 and 16 - I'll have a look at those shortly.

richardofsussex commented 10 months ago

15 and 16: having found that I asked Rob S about roles of actors back in 2020 (such foresight!), I see that they were thinking along the attribute assignment path even then. I've had a go at implementing this:

image

Of course, the 'attribute' being asserted here is the role played by the agent in question in the creation activity, which drops us smack into the middle of the 'property of a property' debate. I've taken the opportunity of this attribute assignment being semi-detached from the core structure to simply name the property as what it is: P14.1_in_the_role_of. What do you think?

As regards 16, I am already picking up the prefix and suffix, and prepending/appending them to the agent's _label. Is this sufficient as a pragmatic solution?
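
For reference, the pragmatic approach described for 16 amounts to something like the following Python sketch (the real logic lives in the XSLT). The function name and the example values are illustrative only, and joining the parts with single spaces is an assumption.

```python
from typing import Optional


def qualified_label(label: str,
                    prefix: Optional[str] = None,
                    suffix: Optional[str] = None) -> str:
    """Prepend/append attribution qualifiers to an agent's _label."""
    # Keep only the parts that are present and non-blank, in display order.
    parts = [p.strip() for p in (prefix, label, suffix) if p and p.strip()]
    return " ".join(parts)


# Hypothetical values, for illustration only:
# qualified_label("Some Artist", prefix="Workshop of") -> "Workshop of Some Artist"
```

The qualifier stays human-readable in the label, but nothing in the record says in machine-readable form that the attribution is qualified.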

richardofsussex commented 10 months ago

In 13, I assume you mean that these fields should each get their own referred_to_by block within the bibliographic citation, not in the top-level referred_to_by block?

richardofsussex commented 10 months ago

On that assumption, this is what we now have: image

RGShepherd commented 10 months ago

Further to 8 / 26:

The "department" thing - might that make more sense as the painting being a member of a group? It might make it clearer for external users ... objects may well belong to a lot of groups going forward - "Objects on display", "Lined paintings", "X-rayed Paintings", etc .... I do not have a specific use case here, just thinking that as we are starting to classify paintings for example in relation to environmental conditions groups might be a nice way to go, as we can then have nice scope notes for the groups .....

All of these are groups of objects that can be returned by searching data, as with department. To set them up as groups as well is to denormalise the data and open up scope for inconsistency to creep in. So let's not over-complicate things to account for speculative suggestions, and just resolve 8.

jpadfield commented 10 months ago

I am not able to see a reference to "department" in the current exports, so it is hard to see what has changed (one of the reasons for versioning things #26). My concern here is more related to consistency - if we are applying classes to our objects we should just be consistent in how it is done, wherever the data comes from in the CIIM. To an external user, stating that a painting is "Main Collection" or "On Display" could be seen as very similar types of data - but obviously, in this case one of these is generally fixed whereas the other is more dynamic.

If we are now being consistent then great - if not, then I would say that if a bit more technical work here could simplify things for users in the future and will make the required documentation easier, it may well be worth it.

Any "group" of paintings can be returned based on a search - I was not specifically needing the example groups I mentioned, or suggesting that we should start creating new groups just in case people might need them. I was just suggesting that we determine an agreed method of modelling a painting's membership in a group, then just use it .....

But again, if that is what we are now doing then great :-)

richardofsussex commented 10 months ago

Here is a proposed implementation for 8:

image

I think that works quite nicely, with the department and the gallery both being present. How does it strike you?

RGShepherd commented 10 months ago

@richardofsussex , I was on the point of saying 'yes' re. 13 when a meeting started; this looks great, thanks. Resolved, I think.

RGShepherd commented 10 months ago

Here is a proposed implementation for 8:

Alas, as per my comment - https://github.com/national-gallery/NG-CIIM/issues/24#issuecomment-1674898474 - our 'departments' aren't really departments (it's slightly misleading TMS terminology). Let's keep the Gallery as the current_custodian, but move our department data back into the top-level classified_as block with a type of http://vocab.getty.edu/aat/300438433, 'status notes'.

RGShepherd commented 10 months ago

Re. 7/24:

Sorry, I started the thought as I was wondering why we wanted to put the type and label into the URL when the type was already covered by the nested AAT reference, but if you are happy and you do not agree with the other two options then we can live with the dummy URL as suggested and then swap them out later when we can

We will create PIDs for these when we've been able to change the way we handle them in TMS. Until then, we're not going to start trying to manage them using the PID API, which is primarily built to serve PIDs to systems, not try and maintain ad-hoc terms like this by hand. Neither will we set up redirects by hand to do the same job.

richardofsussex commented 10 months ago

OK: I was misled by the TMS framework. Now looks like this: image and: image

RGShepherd commented 10 months ago

@richardofsussex , so that I can sign off 1, please could you create a record for object-8690?

richardofsussex commented 10 months ago

Done: http://richardofsussex.me.uk/ng/ciim7-output/object-8690.json

RGShepherd commented 10 months ago

And that's 1 resolved; many thanks!

RGShepherd commented 10 months ago

As regards 16, I am already picking up the prefix and suffix, and prepending/appending them to the agent's _label. Is this sufficient as a pragmatic solution?

I think my preference would be to go full 'style-of', as per the model; but that runs us into a problem where these are very definitely free-text values and can't be mapped to a PID. Back to our un-cool URIs?

RGShepherd commented 10 months ago

19 is now looking good - except that we now have some blank nodes in referred_to_by.content where classified_as.label:"Material Statement".

RGShepherd commented 10 months ago

So I think we're left with 11 (object-1550), 12 (object-1550), 14 (which I can't test because object-1775 throws a 404 error), 16 (object-1550) and 19 (object-1540) to resolve.

richardofsussex commented 10 months ago

I think I've dealt with 11 and 12. The original test case for 14 was object-1755, but neither this nor object-1775 (now available) have a 'lending' key, so I'm puzzled as to what it would prove. I don't see the blank nodes for 19 which you refer to: screenshot please.

RGShepherd commented 10 months ago

I don't see the blank nodes for 19 which you refer to: screenshot please.

I think this happens when we have both keywords and a string - see object-4668, where we have under referred_to_by:

{
    "type": "LinguisticObject",
    "classified_as": [{
            "id": "http://vocab.getty.edu/aat/300435429",
            "type": "Type",
            "_label": "Material Statement",
            "classified_as": [{
                    "id": "http://vocab.getty.edu/aat/300418049",
                    "type": "Type",
                    "_label": "Brief Text"
                }
            ]
        }
    ],
    "content": ""
},
{
    "type": "LinguisticObject",
    "classified_as": [{
            "id": "http://vocab.getty.edu/aat/300435429",
            "type": "Type",
            "_label": "Material Statement",
            "classified_as": [{
                    "id": "http://vocab.getty.edu/aat/300418049",
                    "type": "Type",
                    "_label": "Brief Text"
                }
            ]
        }
    ],
    "content": ""
},
{
    "type": "LinguisticObject",
    "classified_as": [{
            "id": "http://vocab.getty.edu/aat/300435429",
            "type": "Type",
            "_label": "Material Statement",
            "classified_as": [{
                    "id": "http://vocab.getty.edu/aat/300418049",
                    "type": "Type",
                    "_label": "Brief Text"
                }
            ]
        }
    ],
    "content": "Oil on canvas"
},
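
A small Python sketch of the kind of filter that would remove the blank statements shown above; it is illustrative only, since the actual fix belongs in the XSLT. The function name is hypothetical, and treating whitespace-only content as blank is an assumption.

```python
def drop_blank_statements(referred_to_by):
    """Return only the statements whose content is non-empty."""
    return [
        statement for statement in referred_to_by
        # A missing, empty or whitespace-only content counts as blank.
        if (statement.get("content") or "").strip()
    ]
```

Applied to the three blocks above, only the "Oil on canvas" statement would remain.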
RGShepherd commented 10 months ago

@richardofsussex , thanks - I think we now have:

  1. Resolved.
  2. Resolved (at least in object-1550 - I assume you haven't generated a new object-4668 yet?)

14. Well, as there's no longer a key, let's just treat this as resolved.

Which leaves just 16 and 19.

richardofsussex commented 10 months ago

19 done: see object-4668 (also for confirmation of 12).

16, to remind ourselves:

Which leads me to qualified attributions, as contained in creation.maker.@link.prefix and creation.maker.@link.suffix; this looks like a case of https://linked.art/model/assertion/#style-of-attribution. (object-1550; but note that here it relates to 'workshop of', which Linked Art, being pedantic, counts as an entity in its own right; really, does anyone record their data with this level of semantic precision?)

Your indication of a qualified attribution is, let's say, subtle. In object-3520, the third and fourth members of the maker array have role.[].value: "Previous attribution". Only the first of these has a prefix. It seems to me that the Previous attribution role is a better indicator of a qualified attribution than the presence of prefix or suffix.

richardofsussex commented 10 months ago

This is an attempt at a qualified attribution:

image

I think I've got the logic right: this is distinct from 'style of' in that we thought he actually painted the work (but we no longer do). Do you agree? Is this a general solution, or are there other types of qualified attribution we need to address? (See object-3520)

RGShepherd commented 10 months ago

@richardofsussex , sorry - things seem to come up just when I'm on the point of replying to you. And we have the additional problem that the distinctions are obvious to me, 'cos it's my data 😬 Before I check, there are two kinds of qualification:

  1. We thought it then, but we don't now; or, somebody else thinks it but we don't agree. This is indicated by anything other than 'artist' in creation.maker.@link.role.value, and is where we would look at adding an assertion. Let's say these are attribution types.
  2. We don't think it's by the artist, but by somebody connected with the artist (to a greater or lesser degree). This is indicated by values in creation.maker.@link.role.prefix and/or creation.maker.@link.role.suffix; these are where https://linked.art/model/assertion/#style-of-attribution would apply. But we hit the problem (again) of not being able to map them to AAT terms because TMS only lets us enter the data as free text (i.e. not even a controlled terminology), hence my reference to our un-cool URIs. These are what I consider to be attribution qualifiers.

And of course the two could be combined, e.g. a previous attribution involves an attribution qualifier.

RGShepherd commented 10 months ago

I've posted something on the secret TMS Slack to see how others are addressing this problem of not being able to map to URLs - might get something from the business end of Yale, rather than the shiny Lux end.

richardofsussex commented 10 months ago

There is no problem with changing my test from "matches Previous attribution" to "doesn't match Artist". In the case of object-3520, this would yield the same result, so feedback on what I have produced (above) would be helpful. Then you need to tell me what other role values might be present, and we need to consider how they should be mapped.
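
The "doesn't match Artist" test can be sketched as below, in Python rather than XSLT. The data shape (each maker carrying an `@link.role` array of objects with a `value`) is inferred from the paths quoted in this thread. Two further assumptions are labelled in the comments: the comparison ignores case, and a maker with no role at all is grouped with the non-artist attributions.

```python
def split_makers(makers):
    """Separate makers with an 'Artist' role from all other attributions."""
    artists, other_attributions = [], []
    for maker in makers:
        roles = (maker.get("@link") or {}).get("role") or []
        # Assumption: compare case-insensitively, so both
        # "Previous Attribution" and "Previous attribution" count as non-artist.
        values = {(r.get("value") or "").strip().casefold() for r in roles}
        if "artist" in values:
            artists.append(maker)
        else:
            # Assumption: a maker with no role value also lands here.
            other_attributions.append(maker)
    return artists, other_attributions
```

Any role value other than Artist then falls into the second list without the test having to name it.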

RGShepherd commented 10 months ago

I have to say I'm finding the precise modelling for assignment in this context a bit of a brain-bender. But:

It strikes me that separating out non-artist attribution types into a separate assigned_by top-level block makes some kind of sense: we're more interested there in the pattern of assignments than the actual attribution. But given that everything in our produced_by.carried_out_by block is therefore what we would consider to be a definitive statement, does it need the child assigned_by blocks?

But how this relates to attribution qualifiers is harder to unpick. I'd question the fundamental assertion made in the Linked Art model documentation that 'The assessment of "style of" attribution is a judgement decision that might be changed later as new evidence of the actual creator comes to light.' Frankly, this applies to all attributions unless the work is conclusively documented, whether they contain a qualifier or not. (Someone can be as-near-as-dammit certain on stylistic grounds that a work is entirely by a named artist, and that opinion be widely accepted; we would describe the work as 'by' the artist.) We certainly don't distinguish in the system between rock-solid documented attributions and ones made on stylistic grounds. So I suppose that means, in LA terms, that we should include produced_by.carried_out_by.assigned_by everywhere.

Except that, according to https://linked.art/example/object/18, 'style of' attributions aren't included in the produced_by.carried_out_by block at all, just in top-level assigned_bys, which means no-one will find them in the former. Is this a question for the LA listserv?

To answer your other question, possible role (attribution type) values are:

As far as attribution qualifiers (prefix / suffix) go, these are too varied just now to be sensibly mappable within our LA transformations. I've noted the need for an improvement in https://github.com/national-gallery/NG-CIIM/issues/13#issuecomment-1693378298.

Sorry for the slow and long reply; we seem to be back to CRM-like theological discussions ...

RGShepherd commented 10 months ago

In summary: