Aim: Reduce the number of columns to make the data frame more manageable
Proposal: Group the names given in the MODS file by their roles
Explanation: Each "name" entry in the MODS file consists of at least four parts:
"nameXX_namePart.family"
"nameXX_namePart.given"
"nameXX_displayForm"
"nameXX_role_roleTerm"
The number of columns could be reduced significantly if the names were first grouped by role and then concatenated, so that each role needs only three columns (the role itself moves into the column name).
Examples:
PPN735425078 contains 76 names with the role "asn" (= associated name); this amounts to 304 columns (76 × 4), but could be reduced to three columns ("nameASN_namePart.family", "nameASN_namePart.given", "nameASN_displayForm"), each containing 76 names in nested form (Mauschwitz; Baudis; Hoberg; ...)
PPN858144891 contains 50 names with the role "oth" (= other); this amounts to 200 columns (50 × 4), but could be reduced to three columns ("nameOTH_namePart.family", "nameOTH_namePart.given", "nameOTH_displayForm")
PPN1774254956 contains 42 names with the role "ctb" (= contributor); this amounts to 168 columns (42 × 4), but could be reduced to three columns ("nameCTB_namePart.family", "nameCTB_namePart.given", "nameCTB_displayForm")
The most frequently used roles are asn (associated name), oth (other), ctb (contributor), dte (dedicatee), fnd (funder), aut (author), isb (issuing body), egr (engraver), hnr (honoree), ill (illustrator), prt (printer).
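The grouping proposed above could be sketched roughly as follows. This is only an illustration, not an existing implementation: it assumes that one record is available as a mapping from column name to value, that the "XX" in "nameXX" is a running number, and that values are strings or None. The function name group_names_by_role, the "; " separator, and the "UNKNOWN" fallback for names without a role are hypothetical choices.

```python
import re
from collections import defaultdict

# Matches per-name columns such as "name01_namePart.family".
# Assumption: the "XX" in "nameXX" is a running number.
NAME_COL = re.compile(r"^name(\d+)_(.+)$")
ROLE_FIELD = "role_roleTerm"


def group_names_by_role(row, sep="; "):
    """Collapse the nameXX_* columns of one record into one column set per role."""
    entries = defaultdict(dict)  # name number -> {field: value}
    other = {}                   # all columns that are not name columns
    for col, value in row.items():
        match = NAME_COL.match(col)
        if match:
            entries[int(match.group(1))][match.group(2)] = value
        else:
            other[col] = value

    # Group the name entries by role, keeping their original order.
    by_role = defaultdict(list)
    for number in sorted(entries):
        fields = entries[number]
        role = (fields.pop(ROLE_FIELD, None) or "unknown").upper()
        by_role[role].append(fields)

    result = dict(other)
    for role, names in by_role.items():
        # Collect every field that occurs for this role, in first-seen order.
        field_names = []
        for fields in names:
            for field in fields:
                if field not in field_names:
                    field_names.append(field)
        # Missing values become empty strings so that the n-th item of every
        # column still belongs to the same name.
        for field in field_names:
            result[f"name{role}_{field}"] = sep.join(n.get(field) or "" for n in names)
    return result
```

Applied to a record with two "asn" names, the family names would end up in a single "nameASN_namePart.family" column as "Mauschwitz; Baudis". Keeping empty slots for missing parts is a deliberate choice here: without them, the positions in the family, given, and displayForm columns would drift apart.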