ukwa / w3act

w3act is an annotation and curation tool for building web archive collections
Apache License 2.0

Metadata enhancements for Document Harvester #691

Open nicolabingham opened 1 year ago

nicolabingham commented 1 year ago

Create amendments to the metadata screen, including facilities to add full FAST topical, geographic and form headings as secondary subjects, notes fields, personal name qualifiers, e.g. date of birth, more levels for complex corporate hierarchies, and role designators, e.g. $e author
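To make the name-related requests concrete, here is a rough sketch of how a personal name with a date qualifier and role designator, and a corporate name with several hierarchy levels, might be serialised as MARCXML. The tags (700/710), the indicators and the helper names `datafield`, `personal_name` and `corporate_name` are assumptions for illustration only; the real field mapping is whatever the requirements spreadsheet specifies, and the sample names are made up.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"  # MARCXML namespace


def datafield(tag, ind1, ind2, subfields):
    """Build a MARCXML <datafield> from (code, value) pairs, skipping empty values."""
    field = ET.Element(
        f"{{{MARC_NS}}}datafield", {"tag": tag, "ind1": ind1, "ind2": ind2}
    )
    for code, value in subfields:
        if value:
            sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
            sub.text = value
    return field


def personal_name(name, dates=None, role=None):
    """Hypothetical helper: $a name, $d dates qualifier, $e role designator."""
    return datafield("700", "1", " ", [("a", name), ("d", dates), ("e", role)])


def corporate_name(body, subordinate_units=(), role=None):
    """Hypothetical helper: $a body, one repeated $b per hierarchy level, $e role."""
    subfields = [("a", body)]
    subfields += [("b", unit) for unit in subordinate_units]
    subfields.append(("e", role))
    return datafield("710", "2", " ", subfields)


if __name__ == "__main__":
    person = personal_name("Smith, Jane", dates="1970-", role="author")
    body = corporate_name(
        "Great Britain",
        ["Department for Education", "Standards and Testing Agency"],
    )
    print(ET.tostring(person, encoding="unicode"))
    print(ET.tostring(body, encoding="unicode"))
```

Repeating `$b` once per level means the screen can offer as many hierarchy levels as needed without changing the export code.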

ANJ: Suggested implementation steps:

Originally, @anjackson requested that this be implemented behind a feature flag so we could switch between the two versions easily. However, having talked it through, I think the changes are probably too invasive for that to be manageable. If that does turn out to be the case, we'll just use git branches to manage the different versions instead.

anjackson commented 1 year ago

Some questions:

nicolabingham commented 1 year ago

The following documents contain details of the fields to be added, together with the values to implement. We can go through these on Friday.

MetdataScreenSketchv1.pdf

DDHAPT-Metadata-Screen-Requirements-List-v2-MARCXML.xlsx

There is a work request in with Alan Danskin for the MARC XML which I will chase.

nicolabingham commented 1 year ago

The MARC XML has been added in the xlsx file attached to this ticket.

nicolabingham commented 1 year ago

@min2ha Jennie confirmed that there should be two separate fields, "ISBN" and "Invalid ISBN". Invalid ISBN should be used when the department has misused the ISBN, i.e. used the same ISBN for a PDF and an HTML version, which are technically different bibliographic entities. So the current implementation is correct.


nicolabingham commented 1 year ago

@min2ha Jennie has requested four small tweaks to the metadata screen:

1. Topical headings: please label the second and third Subdivision boxes as "Subdivision 2" and "Subdivision 3".

2. Geographical names: the OCLC control number box should come after Subdivision 2, not before.

3. The Add/Remove buttons work nicely, but the position and labelling of the boxes should be identical in each line.

4. All OCLC FAST headings include a mandatory subfield $2fast before the OCLC number. Can this be added by the programme? (The FAST number, e.g. fst00809209, needs to be exported to the MARC record and into Aleph, the catalogue.)
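On the fourth point, one possible way to have the software add the source code automatically is sketched below: the user enters only the heading, any subdivisions and the FAST number, and `$2 fast` is appended at export time, just before the number. The function names, the use of `$x` for subdivisions, `$0` for the FAST number and the 650 tag with second indicator 7 are all assumptions for illustration; the heading text is a placeholder, not the real heading for fst00809209.

```python
FAST_SOURCE = ("2", "fast")  # mandatory source subfield for FAST headings


def fast_topical_subfields(heading, fast_number, subdivisions=()):
    """Return ordered (code, value) pairs for one FAST topical heading.

    $2 fast is inserted by the code, never typed by the user, and is
    placed immediately before the FAST number.
    """
    subfields = [("a", heading)]
    # Assumption: every subdivision box maps to $x; blank boxes are skipped.
    subfields += [("x", s) for s in subdivisions if s]
    subfields.append(FAST_SOURCE)
    # Assumption: the FAST number is exported in $0.
    subfields.append(("0", fast_number))
    return subfields


def render(tag, indicators, subfields):
    """Render a field in the usual one-line MARC display form."""
    body = " ".join(f"${code} {value}" for code, value in subfields)
    return f"{tag} {indicators} {body}"


if __name__ == "__main__":
    pairs = fast_topical_subfields(
        "Example heading", "fst00809209", subdivisions=["Example subdivision", ""]
    )
    print(render("650", "_7", pairs))
```

Keeping `$2 fast` out of the form entirely means it cannot be mistyped or omitted, and the same function could serve the geographic and form headings with different tags.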

min2ha commented 10 months ago

@nicolabingham, sorry, this comment is meant for Jennie. It concerns visual look-and-feel improvements only. For a better user experience, I'd recommend grouping the data, i.e. visually framing records of the same data type. In short, from a data model perspective: once the DH enhancements are implemented, we'll have a further six new entities linked to the document, each in a one-to-many relationship, so it would be convenient to frame the grouped records of each type.
An initial draft version is deployed on the DEV server; I hope it helps to evaluate the pros and cons. Development is still in progress and this is for testing purposes only: data can be added and the VIEW/EDIT modes evaluated, but there is no data validation yet.

[Four screenshots, taken 2023-10-16 between 15:28 and 15:33]
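As a minimal sketch of the grouping idea, assuming each new entity is a child record of the document tagged with its type, the view could bucket the records by type and draw one frame per bucket. The class `MetadataRecord`, the function `group_by_kind` and the type labels below are hypothetical and do not reflect the actual w3act model classes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    kind: str   # hypothetical type label, e.g. "topical_heading"
    value: str  # display value of the record


def group_by_kind(records):
    """Bucket a document's child records by type, preserving first-seen order."""
    groups = defaultdict(list)
    for record in records:
        groups[record.kind].append(record)
    return dict(groups)


if __name__ == "__main__":
    records = [
        MetadataRecord("topical_heading", "Heading A"),
        MetadataRecord("note", "Note 1"),
        MetadataRecord("topical_heading", "Heading B"),
    ]
    # One visual frame per record type, each listing its records.
    for kind, items in group_by_kind(records).items():
        print(f"[{kind}]")
        for item in items:
            print(f"  {item.value}")
```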

min2ha commented 10 months ago

MARC XML tag fields have been added to the document object, reflecting the extended data model. Test: https://dev.webarchive.org.uk/act/documents/374704/sip