UMNLibraries / cdm-blacklightify

Blacklightify a CONTENTdm collection
MIT License

Multi-language support #65

Open mberkowski opened 2 years ago

mberkowski commented 2 years ago

CDM will start storing some multilingual items that may need multilingual text descriptions, rights statements, etc. These would come from CDM as alternate fields; we might need to store them in separate Solr fields and display them.

JR does not have thoughts yet on how to implement the UI. La Prensa newspaper might be the test case for this.

This is NOT an i18n of all of UMedia, but rather a potential swap of display values on certain collections/publications.

mberkowski commented 2 years ago

@jasonoroy This was on the GH board, not Trello. When I originally entered it, I put "JR does not have thoughts yet on how to implement the UI" -- if you do have thoughts on it now, feel free to expand on them here.

jasonoroy commented 2 years ago

Within the metadata schema, there would only be a few Spanish-language fields that we would need to swap out. Here is the list of the current fields and their suggested Spanish-language equivalents:

  1. DESCRIPTION & DESCRIPTION - SPANISH
  2. ITEM PHYSICAL FORMAT & ITEM PHYSICAL FORMAT - SPANISH
  3. LOCALLY ASSIGNED SUBJECT HEADINGS & LOCALLY ASSIGNED SUBJECT HEADINGS - SPANISH
  4. COUNTRY & COUNTRY - SPANISH
  5. CONTINENT & CONTINENT - SPANISH
  6. LOCAL RIGHTS STATEMENT & LOCAL RIGHTS STATEMENT - SPANISH
  7. RIGHTS STATEMENT URI & RIGHTS STATEMENT URI - SPANISH
  8. FISCAL SPONSOR & FISCAL SPONSOR - SPANISH

View current metadata spreadsheet

In addition, we will need to translate all the field titles, even if the fields contain the same information in either language (e.g. Date of Creation or Fecha de creación; Title or Título), though I'm not sure how you want to deal with storing that information.

The overall idea would be to toggle the metadata details between English and Spanish.
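
The toggle described above could be sketched roughly as follows. This is a minimal Ruby sketch under stated assumptions, not the UMedia implementation: the field keys (`description_spanish` and so on) and the names `SPANISH_EQUIVALENTS`, `FIELD_LABELS`, `display_value`, and `field_label` are all hypothetical and chosen only for illustration. Falling back to the English value when no Spanish value exists is also an assumption, not something decided in this thread.

```ruby
# Hypothetical pairing of English fields with their Spanish equivalents,
# based on the eight pairs listed above. These keys are illustrative and
# are NOT real Solr or CONTENTdm field names.
SPANISH_EQUIVALENTS = {
  "description" => "description_spanish",
  "item_physical_format" => "item_physical_format_spanish",
  "locally_assigned_subject_headings" => "locally_assigned_subject_headings_spanish",
  "country" => "country_spanish",
  "continent" => "continent_spanish",
  "local_rights_statement" => "local_rights_statement_spanish",
  "rights_statement_uri" => "rights_statement_uri_spanish",
  "fiscal_sponsor" => "fiscal_sponsor_spanish"
}.freeze

# Field titles are translated even when the value itself does not change.
FIELD_LABELS = {
  en: { "title" => "Title", "date_of_creation" => "Date of Creation" },
  es: { "title" => "Título", "date_of_creation" => "Fecha de creación" }
}.freeze

def blank_value?(value)
  value.nil? || value.to_s.strip.empty?
end

# Returns the Spanish value when the locale is :es and one is present;
# otherwise falls back to the English value (assumed behavior).
def display_value(record, field, locale)
  if locale == :es
    spanish_value = record[SPANISH_EQUIVALENTS[field]]
    return spanish_value unless blank_value?(spanish_value)
  end
  record[field]
end

# Returns the translated field title, falling back to English and then
# to the raw field key.
def field_label(field, locale)
  FIELD_LABELS.fetch(locale, {}).fetch(field) { FIELD_LABELS[:en].fetch(field, field) }
end
```

With a record such as `{ "description" => "A proposal", "description_spanish" => "Una propuesta" }`, calling `display_value(record, "description", :es)` would return the Spanish text, while a field with no Spanish counterpart would keep showing its English value.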

mberkowski commented 1 year ago

@jasonoroy do you know yet what the internal metadata field names will be? I mean the short names as exposed by the CDM JSON API. These will be read and indexed by the CDMDEXER Ruby gem, but they need to be predefined in there. I don't see any Spanish fields defined in the CONTENTdm fields admin screen yet, but I'm also not sure I'm looking in the right place. E.g. https://cdm16022.contentdm.oclc.org/digital/bl/dmwebservices/index.php?q=dmGetItemInfo/p16022coll609/623/json

{
  "title": "Chicano Cultural Center Proposal (Box 1, Folder 17)",
  "altern": {},
  "creato": {},
  "contri": {},
  "publis": {},
  "descri": {},
  "captio": {},
  "additi": {},
  "projea": {},
  "date": "1975 - 1976",
  "histor": {},
  "type": "Mixed material",
  "format": {},
  "dimens": {},
  "subjec": {},
  "fast": {},
  "langua": "English; Spanish",
  "transc": {},
  "transl": {},
  "city": "Minneapolis",
  "state": "Minnesota",
  "countr": "United States",
  "region": {},
  "contin": "North America",
  "projec": {},
  "scale": {},
  "coordi": {},
  "geonam": {},
  "a": "Chicano Studies Records (ua2018-0004); https://archives.lib.umn.edu/repositories/14/resources/7584",
  "contra": "University of Minnesota Libraries, University Archives.",
  "contac": "University of Minnesota Libraries, University Archives. 218 Elmer L. Andersen Library, 222 - 21st Avenue South, Minneapolis, MN 55455; https://www.lib.umn.edu/uarchives",
  "local": "Use of this item may be governed by US and international copyright laws. You may be able to use this item, but copyright and other considerations may apply. For possible additional information or guidance on your use, please contact the contributing organization.",
  "righta": {},
  "expect": {},
  "addita": {},
  "fiscal": {},
  "identi": "ua2018-0004, Box 1, Folder 17",
  "barcod": {},
  "system": {},
  "dls": "ua2018-0004-box01-fdr17",
  "kaltur": {},
  "kaltua": {},
  "kaltub": {},
  "kaltuc": {},
  "kaltud": {},
  "persis": {},
  "umedia": "yes",
  "attach": {},
  "featur": {},
  "fullrs": {},
  "find": "624.cpd",
  "dmaccess": {},
  "dmimage": {},
  "dmcreated": "2022-08-16",
  "dmmodified": "2022-08-16",
  "dmoclcno": {},
  "dmrecord": "623",
  "restrictionCode": "1",
  "cdmfilesize": "1027",
  "cdmfilesizeformatted": "0.00 MB",
  "cdmprintpdf": "1",
  "cdmhasocr": "0",
  "cdmisnewspaper": "0"
}
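
In the response above, fields with no value come back as an empty object (`{}`) rather than as `null` or an empty string, so any code that looks for the new Spanish nicknames would probably want to drop those first. A minimal Ruby sketch of that step, assuming a response shaped like the one above; `compact_cdm_item` and `split_values` are hypothetical helper names, not part of CDMDEXER, and treating `; ` as a multi-value separator is an assumption based on the `langua` value shown:

```ruby
require "json"

# Parses a dmGetItemInfo-style JSON string and drops fields whose value
# is an empty object ({}), which is how empty fields appear above.
def compact_cdm_item(raw_json)
  JSON.parse(raw_json).reject { |_nickname, value| value.is_a?(Hash) && value.empty? }
end

# Splits a semicolon-delimited value such as "English; Spanish".
# Assumption: semicolons are only used as separators in the fields
# this is applied to.
def split_values(value)
  value.to_s.split(/;\s*/).map(&:strip).reject(&:empty?)
end

# Abbreviated sample taken from the response above.
sample = <<~JSON
  {
    "title": "Chicano Cultural Center Proposal (Box 1, Folder 17)",
    "descri": {},
    "langua": "English; Spanish",
    "fiscal": {}
  }
JSON

item = compact_cdm_item(sample)
puts item.keys.inspect
puts split_values(item["langua"]).inspect
```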

jasonoroy commented 1 year ago

@mberkowski I have not yet created the bilingual collection in CONTENTdm so cannot yet provide you with the shortened nicknames for the fields. I would like to discuss with you a couple of options for creating this collection to gauge your opinion. Perhaps we can discuss at our next meeting.

jasonoroy commented 1 year ago

@mberkowski

OK, all newspapers have been uploaded to CONTENTdm, but the collection remains unpublished and as such is not available through UMedia. The collection ID is p16022coll613.

The new Spanish-language fields are as follows, along with their CDM nicknames:

sp_Description | spdesc
sp_Additional Notes | spaddi
sp_Item Physical Format | spitem
sp_Locally Assigned Subject Headings | sploca
sp_Language | splang
sp_Country | spcoun
sp_Continent | spcont
sp_Local Rights Statement | splocb
sp_Rights Statement URI | sprigh
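
For reference, the nickname list above could be carried into indexing code as a simple lookup table. This is only a sketch, assuming the nicknames stay as listed: it does not reflect CDMDEXER's actual configuration format, and the `_es` suffix used for the separate Solr field names is a hypothetical naming choice, as are `SPANISH_FIELD_NICKNAMES` and `spanish_solr_fields`.

```ruby
# CDM nickname => field name, copied from the list above.
SPANISH_FIELD_NICKNAMES = {
  "spdesc" => "sp_Description",
  "spaddi" => "sp_Additional Notes",
  "spitem" => "sp_Item Physical Format",
  "sploca" => "sp_Locally Assigned Subject Headings",
  "splang" => "sp_Language",
  "spcoun" => "sp_Country",
  "spcont" => "sp_Continent",
  "splocb" => "sp_Local Rights Statement",
  "sprigh" => "sp_Rights Statement URI"
}.freeze

# Copies any populated Spanish fields from a parsed CDM item into a hash
# of separate Solr fields. The "<nickname>_es" names are hypothetical.
def spanish_solr_fields(item)
  SPANISH_FIELD_NICKNAMES.each_key.with_object({}) do |nickname, doc|
    value = item[nickname]
    # Skip missing fields and CDM's empty-object ({}) placeholder.
    next if value.nil? || (value.respond_to?(:empty?) && value.empty?)

    doc["#{nickname}_es"] = value
  end
end
```

Fields outside the list, such as `title`, are ignored by this helper, so it could run alongside whatever already handles the English fields.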

Let me know if you need me to customize the nicknames for better clarity.

mberkowski commented 1 year ago

Thanks @jasonoroy - I'll start setting these up in cdmdexer

mberkowski commented 1 year ago

CDM e.g. https://cdm16022.contentdm.oclc.org/digital/bl/dmwebservices/index.php?q=dmGetItemInfo/p16022coll613/48/json