bodleian / ora_data_model

Documentation and crosswalks relating to the ORA data model

Author order not harvesting correctly via crosswalks #181

Closed mrdsaunders closed 4 years ago

mrdsaunders commented 4 years ago

Author metadata order in ORA is recorded in role_order, which is created by the deposit crosswalk. However, author order is not harvesting from ORA correctly, since the engine cannot sort on role_order. The order in the SE ORA record reflects the order of the mods:name elements, and not the role_order values. From AB:

I'm afraid the crosswalk engine does not currently have the capability of specifying an adjusted order for multivalued items (like author lists). During deposit, the authors are iterated over in the order they appear, so the elements will be in sequence (1, 2, 3, etc). When harvesting back, there is an implicit assumption that the order of the elements in the input document is the same as the ordering of the role_order values. I take it this is not the case? I hadn't really considered that the incoming data would not be ordered already.
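
To make the assumption described above concrete, here is a minimal Python sketch. The record layout and both function names are hypothetical and are not part of the crosswalk engine; the sketch only contrasts reading authors in document order with reading them sorted on role_order.

```python
# Hypothetical input: each author carries a role_order value, but the
# elements appear in the document in a different sequence.
authors_in_document = [
    {"name": "B. Second", "role_order": 2},
    {"name": "A. First", "role_order": 1},
    {"name": "C. Third", "role_order": 3},
]


def harvest_in_document_order(authors):
    """Mimic the implicit assumption: element order is the author order."""
    return [author["name"] for author in authors]


def harvest_by_role_order(authors):
    """Sort on the role_order value before reading the names."""
    ordered = sorted(authors, key=lambda author: author["role_order"])
    return [author["name"] for author in ordered]


naive = harvest_in_document_order(authors_in_document)
intended = harvest_by_role_order(authors_in_document)
```

The two lists differ whenever the element sequence and the role_order values disagree, which is the situation reported in this issue.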

I believe this to be a significant and immediate concern for us, affecting both departmental websites and also REF review, REF acceptance and REF attribution based on contribution. Temporary mitigation by reducing the precedence of the ORA record would have significant implications for the smart report since it would affect OAM data.

AB has suggested pre-sorting authors prior to the Sword endpoint. Failing that, he could investigate amending the engine, but this would likely use the upgraded engine that will come with 5.20.

My initial question is what resource and timescale would be involved in pre-sorting author lists, based on role_order, pre-Sword?

mrdsaunders commented 4 years ago

As an aside, given that contributor items deposit in sequence, what is the process by which they get out of order other than an error in order on the deposited record?

mrdsaunders commented 4 years ago

Andrew Bennett has offered to correct this temporarily (pending incorporation into the crosswalks engine) in a patch due in about a fortnight.

mrdsaunders commented 4 years ago

Just discussing a spec for the fix with Andrew Bennett. He will pre-sort the mods:name elements according to the value of the ora:role_order element.
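
For discussion, a rough sketch of what such a pre-sort could look like, written in Python with the standard library. This is not AB's implementation. It assumes ora:role_order is a direct child of each mods:name and holds an integer; the ORA namespace URI below is a placeholder, and `presort_names` and `role_order_key` are hypothetical names. Only the MODS namespace URI is the real one.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ORA_NS = "http://example.org/ora"  # placeholder, NOT the real ORA namespace
NS = {"mods": MODS_NS, "ora": ORA_NS}


def role_order_key(name_el):
    """Return the integer role_order of a mods:name; names without one sort last."""
    text = name_el.findtext("ora:role_order", default="", namespaces=NS).strip()
    return int(text) if text.isdigit() else float("inf")


def presort_names(root):
    """Reorder mods:name children by role_order, leaving other children in place."""
    children = list(root)
    name_tag = f"{{{MODS_NS}}}name"
    # Remember which positions hold mods:name elements.
    slots = [i for i, child in enumerate(children) if child.tag == name_tag]
    # sorted() is stable, so names with equal keys keep their document order.
    ordered = sorted((children[i] for i in slots), key=role_order_key)
    for slot, name_el in zip(slots, ordered):
        children[slot] = name_el
    root[:] = children
    return root


SAMPLE = f"""<mods:mods xmlns:mods="{MODS_NS}" xmlns:ora="{ORA_NS}">
  <mods:titleInfo><mods:title>Example</mods:title></mods:titleInfo>
  <mods:name><mods:displayForm>B. Second</mods:displayForm><ora:role_order>2</ora:role_order></mods:name>
  <mods:name><mods:displayForm>C. Third</mods:displayForm><ora:role_order>3</ora:role_order></mods:name>
  <mods:name><mods:displayForm>A. First</mods:displayForm><ora:role_order>1</ora:role_order></mods:name>
</mods:mods>"""

root = presort_names(ET.fromstring(SAMPLE))
harvested = [
    name.findtext("mods:displayForm", namespaces=NS)
    for name in root.findall("mods:name", NS)
]
```

The sketch sorts every mods:name regardless of role; whether the sort should instead be restricted per roleTerm is the open question raised below.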

An easy approach is to just pre-sort elements with mods:roleTerm equal to Author. But then what about Editors? Do they need to be sorted?

Can I just confirm that ORA reviewers do add a role_order value for editors?

I believe we should ask him to take that into account should there be multiple translators.

tomwrobel commented 4 years ago

role order is added to all roles, including editor and translator

mrdsaunders commented 4 years ago

So according to the deposit crosswalk, we do send a role_order value for: Author, Editor, Translator, Contributor, Depositor (not sure how there can be more than 1 though...)

mrdsaunders commented 4 years ago

From AB:

Is it possible to have a mods:name element with roleTerms of - say - Author and Editor? In this case, how should the ordering occur? It's very possible I'm overcomplicating this, and that in practice none of these complexities will occur, but I'd like to ensure that we've explicitly discussed this!

However unlikely, the SE data model does permit a person to be an author, editor, translator and associated author (contributor) currently. What about in ORA?

tomwrobel commented 4 years ago

absolutely normal

A contributor can have multiple roles, each role has a title and a role order. So a person can have:

contributor:
  display_name: "Tom Wrobel"
  # other fields
  roles:
    - role_title: author
      role_order: 2
    - role_title: editor
      role_order: 3
    - role_title: depositor
      role_order: 1 
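
One way to read this structure when a person holds several roles is to build a separate ordered list per role. The Python sketch below assumes role_order is only compared between entries that share a role_title, which this thread does not confirm; the second contributor and the function name `names_by_role` are invented for illustration.

```python
from collections import defaultdict

# Mirrors the structure above; "Example Person" is a made-up second contributor.
contributors = [
    {
        "display_name": "Tom Wrobel",
        "roles": [
            {"role_title": "author", "role_order": 2},
            {"role_title": "editor", "role_order": 3},
            {"role_title": "depositor", "role_order": 1},
        ],
    },
    {
        "display_name": "Example Person",
        "roles": [{"role_title": "author", "role_order": 1}],
    },
]


def names_by_role(people):
    """Group display names by role_title, each group sorted on role_order."""
    grouped = defaultdict(list)
    for person in people:
        for role in person["roles"]:
            grouped[role["role_title"]].append(
                (role["role_order"], person["display_name"])
            )
    return {
        title: [name for _, name in sorted(entries, key=lambda entry: entry[0])]
        for title, entries in grouped.items()
    }


ordered_roles = names_by_role(contributors)
```

Under that assumption the same person can appear second in the author list and first in the editor list without any conflict.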

mrdsaunders commented 4 years ago

Can there be more than one depositor? I assume so where there are multiple deposits?

mrdsaunders commented 4 years ago

But for this matter in question, we would never need to harvest a depositor list back to SE.

tomwrobel commented 4 years ago

for us, depositors are ignored beyond the first one

On 16 Mar 2020, at 12:42, David Saunders wrote:

Can there be more than one depositor? I assume so where there are multiple deposits?

mrdsaunders commented 4 years ago

AB has fixed this, for 5.19 without a patch and then in 5.20. See the link to #182.