galterlibrary / digital-repository

DigitalHub - Institutional Repository for Galter Health Sciences
https://digitalhub.northwestern.edu/

[Export] Additional Organization names #1132

Closed fenekku closed 1 year ago

fenekku commented 1 year ago

I've noticed that some organizations are exported as people. For example, for the name

Ill.) ,World's Columbian Dental Congress (1893 : Chicago

the current export returns

{
  "type": "personal",
  "given_name": "Ill.) ",
  "family_name": "World's Columbian Dental Congress (1893 : Chicago"
}

that needs to be replaced by:

{
  "type": "organizational",
  "name": "World's Columbian Dental Congress (1893 : Chicago, Ill.)"
}

The following should be all of the names that need the same fix:

Ill.) ,World's Columbian Dental Congress (1893 : Chicago
Ill.) ,World's Columbian Exposition (1893 : Chicago
Ill.). Finance Committee for Illinois ,World's Columbian Dental Congress (1893 : Chicago
Ill.). Committee on Essays. ,World's Columbian Dental Congress (1893 : Chicago
Ill.). World's Congress Auxiliary ,World's Columbian Exposition (1893 : Chicago
Ill.). Committee on Nomenclature. ,World's Columbian Dental Congress (1893 : Chicago
Ill.). Board of Trustees,Northwestern University (Evanston
Biostatistics,Collaboration Center

You can double-check via this csv: 2022_11_09_names.csv
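
For the example at the top of this issue, the requested fix amounts to rejoining the two halves of the name around the comma they were split on. Here is a minimal Python sketch of that idea. The function name `to_organization` is hypothetical, and the split-at-the-comma assumption is only confirmed by that one example; other entries in the list (e.g., `Biostatistics,Collaboration Center`) may not follow it.

```python
def to_organization(record):
    """Rebuild an organizational record from one wrongly exported as a person."""
    # Assumption: the original name was split at its comma, with the text
    # before the comma stored as family_name and the text after it stored
    # as given_name (possibly with stray whitespace).
    family = record["family_name"].strip()
    given = record["given_name"].strip()
    return {"type": "organizational", "name": f"{family}, {given}"}


exported = {
    "type": "personal",
    "given_name": "Ill.) ",
    "family_name": "World's Columbian Dental Congress (1893 : Chicago",
}
fixed = to_organization(exported)
```

Here `fixed` has the shape requested above: an `"organizational"` record with a single `"name"` field.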

Meowcenary commented 1 year ago

The rules for identifying a person versus an organization are incredibly shaky. Since we're in the late stages of the migration process, I've increasingly started to hard-code rules, and I think this is a situation where that would make the most sense.
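
One way to picture the hard-coded approach is a lookup table of known exceptions consulted before the export emits a personal name. This is only a rough Python sketch, not the repository's actual code: `ORGANIZATION_OVERRIDES` and `apply_overrides` are hypothetical names, and only the pair confirmed in this issue is filled in.

```python
# Hypothetical override table. Keys are (given_name, family_name) pairs as
# they appear in the current export (whitespace-trimmed); values are the
# organization name to emit instead. Only the confirmed example is listed.
ORGANIZATION_OVERRIDES = {
    ("Ill.)", "World's Columbian Dental Congress (1893 : Chicago"):
        "World's Columbian Dental Congress (1893 : Chicago, Ill.)",
}


def apply_overrides(record):
    """Return an organizational record for known exceptions, else the record unchanged."""
    if record.get("type") != "personal":
        return record
    key = (
        record.get("given_name", "").strip(),
        record.get("family_name", "").strip(),
    )
    name = ORGANIZATION_OVERRIDES.get(key)
    if name is None:
        return record  # not a known exception; leave it as a person
    return {"type": "organizational", "name": name}
```

An explicit table keeps the fix limited to the names listed in the issue, so genuine personal names are never reclassified by a heuristic.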