Open flapka opened 5 years ago
For library collections, gender data is not directly available in our MARC records. But if we use the name headings in MARC records to query data sources such as VIAF or Wikidata, gender data can be harvested for a significant portion of the creators/contributors to library objects.
To explore: find the most effective and sustainable method for harvesting the data from VIAF etc.
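As a starting point for that exploration, here is a minimal sketch of one possible approach. It assumes a name heading has already been resolved to a VIAF ID, and it looks that ID up in Wikidata. The function names are hypothetical. The Wikidata properties used are P214 (VIAF ID) and P21 (sex or gender). The sketch only builds the request URL and parses a SPARQL JSON response, so the actual HTTP call, rate limiting, and error handling are left out.

```python
import json
from urllib.parse import urlencode

# Public Wikidata SPARQL endpoint.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_gender_query(viaf_id: str) -> str:
    """Build a SPARQL query for the gender (P21) of the entity with this VIAF ID (P214)."""
    if not viaf_id.isdigit():
        # VIAF IDs are numeric; rejecting anything else also keeps the query safe.
        raise ValueError(f"not a numeric VIAF ID: {viaf_id!r}")
    return (
        "SELECT ?person ?genderLabel WHERE { "
        f'?person wdt:P214 "{viaf_id}" . '
        "?person wdt:P21 ?gender . "
        'SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . } '
        "}"
    )


def build_request_url(viaf_id: str) -> str:
    """Return the GET URL that asks the endpoint for JSON results."""
    params = {"query": build_gender_query(viaf_id), "format": "json"}
    return WIKIDATA_SPARQL_ENDPOINT + "?" + urlencode(params)


def parse_gender_labels(response_text: str) -> list:
    """Extract the distinct gender labels from a SPARQL JSON results document."""
    data = json.loads(response_text)
    labels = []
    for row in data.get("results", {}).get("bindings", []):
        label = row.get("genderLabel", {}).get("value")
        if label and label not in labels:
            labels.append(label)
    return labels
```

An empty list from `parse_gender_labels` means Wikidata has no matching entity or no gender statement, and the sketch treats that as "unknown" rather than guessing.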
@flapka Should we turn our attention to this again?
@edgartdata I hope this might be in scope of the Mellon grant activity. Is that reasonable?
@edgartdata @yulgit1 In the long term, I'd like to retrieve gender information from data in Wikidata, etc.
But for now -- and since we have some time on our hands for special projects -- I'd like to take a stab at mapping gender info from our MARC records.
Bonnie and I plan to add such data in the coming weeks. Our initial focus will be to add gender data where we know that the item creator or contributor(s) are women. For such items, the MARC will have this field: 386__ $m Gender group $a Women $2 lcsh
The question that remains is what we'd do for the (large portion of) library records that won't have this field? Do we prefer:
Thoughts?
@flapka "Women/man/nonbinary/undetermined" seems ok but just "woman/undetermined" seems incomplete.
Maybe a field with a label like "Women Creator: true/unknown"?
Also "female" vs "woman"? Thinking of how I'd catalog a picture by my 6yo niece (if there are any cases like this in the museum).
Incomplete yes, which is why perhaps it makes sense to leave the value unmapped where we don't have a 386 field.
For the XSLT I imagine something along the lines of: where 386 $a = "Women", set the gender value to "female" -- for consistency with art collections data.
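To make the proposed rule concrete, here is a rough sketch of the same logic in Python rather than XSLT. It is not the committed stylesheet. It assumes the record is available as MARCXML in the standard `http://www.loc.gov/MARC21/slim` namespace, and the function name is hypothetical. Records without a matching 386 are left unmapped (`None`), as suggested above.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def gender_from_marcxml(record_xml: str):
    """Return "female" if any 386 $a equals "Women"; otherwise None (unmapped)."""
    record = ET.fromstring(record_xml)
    path = './/marc:datafield[@tag="386"]/marc:subfield[@code="a"]'
    for subfield in record.findall(path, MARC_NS):
        if (subfield.text or "").strip() == "Women":
            return "female"
    # No 386 field, or a 386 with some other value: leave the gender unmapped.
    return None
```

Returning `None` instead of a placeholder such as "undetermined" keeps the absence of a 386 field distinct from an explicit statement about gender.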
@yulgit1 @edgartdata
Circling back to this. Do we need anything more to implement?
As noted above, this is an interim solution for RB material.
addressed: https://git.yale.edu/ermadmix/ycba_xslts/commit/8a7dbffc0288d765197947d5d855338d83f4589e
example: http://ycba-collections-dev.herokuapp.com/catalog/orbis:13557713.json
full indexing over the weekend
Will review next week that the gender data comes through nicely.
Earlier this month, the Library of Congress Program for Cooperative Cataloging issued a Revised Report on Recording Gender in Personal Name Authority Records (PDF).
The report makes a new recommendation:
Do not record the RDA gender element (MARC 375) in personal name authority records. Delete existing 375 fields when editing a record for any other reason.
The report provides the following rationale:
Gender, like many other attributes of persons specified in RDA, is an optional element that the PCC and cataloging community considered potentially useful to record in Name Authority Records for the following purposes:
● Distinguishing among persons with same or similar names
● Providing contextual information for users and catalogers
● Identifying persons known by phrases, pseudonyms, initialisms, ambiguous names, or names unfamiliar to catalogers or users
● Facilitating searches that limit to a particular gender (e.g., female composers)
● Facilitating grammatical labeling of entities and relationships in data displays, especially in languages with grammatical gender (e.g., autor/autora in Spanish, Herausgeber/Herausgeberin in German). In some languages, romanization, pronunciation, and case conversion of a name may change depending on the associated gender.
However these considerations do not outweigh the potential risks involved in recording gender for the following reasons:
● The primary goal of authority data is for disambiguation, not contextual biographical information
● It is not the role of the cataloger to determine and record personally Identifiable Information (PII) in authority and bibliographic data.
● Gender identity, the vocabulary used to describe it, and the degree to which individuals are able to and choose to disclose it, are complex, contextual, personal, and subject to change over time and in different environments and jurisdictions.
● Trans and non-binary individuals in particular are more likely to experience negative consequences (such as discrimination, psychological harm, and violence) as a result of being misgendered (when a gender identity is incorrectly imposed on them by someone else), “outed” (when their gender identity is disclosed by someone else without their consent), or “deadnamed” (when a given or birth name is disclosed or used without their consent), whether intentionally or not.
● As library data is increasingly opened and repurposed, there could be additional unforeseen and potentially irreversible ramifications for recording gender information.
I agree sufficiently with the recommended outcome, for the reasons outlined in the final three bullets above (though I don't entirely agree with the two bullets that precede those).
Following the policy above, RBM should delete gender data from 899 object records. Frankly, the gender data in that set was insufficiently sourced to begin with (the product of a too-hasty pandemic project), so there's warrant to roll it back on multiple counts.
The online collections group notes the following, in discussion of the new LC/PCC policy:
FYI an event on May 18th: https://artinformationcommons.github.io/2022-05-18-art-in-context-identity-ethics-and-insight-symposium/
The symposium on Identity, Ethics, and Insight that @yulgit1 mentioned above is organized by colleagues at the Philadelphia Museum of Art and Library, and I highly recommend it.
Thanks for the meeting minutes @flapka. I'd like to add that keeping gender information in the collections catalog may also be helpful to/welcomed by people with under-represented gender identities who want to be visibly accounted for by institutions. Not publishing gender data is not necessarily an inclusive approach from that standpoint. This topic obviously requires a thoughtful and longer conversation and the next step will be to meet with the curatorial division.
For all collections, it sounds like we want to broaden the scope of the Gender facet so that it queries the gender of all known contributors -- not just the primary creator.
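A minimal sketch of what that broader facet could mean at indexing time, assuming each record carries a list of contributors with an optional gender value. The record structure, the key name `gender`, and the function name are all hypothetical and are not taken from the existing index schema.

```python
def gender_facet_values(contributors: list) -> list:
    """Collect the distinct gender values recorded for all contributors.

    Each contributor is a dict; a missing or empty "gender" key means no
    value is recorded, and that contributor adds nothing to the facet.
    """
    values = []
    for contributor in contributors:
        gender = contributor.get("gender")
        if gender and gender not in values:
            values.append(gender)
    return values
```

With this shape, a record whose primary creator has no recorded gender would still match the "female" facet if any other contributor is recorded as female.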