ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum
Apache License 2.0

Code Table Request - language #7781

Open wellerjes opened 2 months ago

wellerjes commented 2 months ago

Initial Request

Goal

Describe what you're trying to accomplish. This is the only necessary step to start this process. The Committee is available to assist with all other steps. Please clearly indicate any uncertainty or desired guidance if you proceed beyond this step.

create an attribute to record physical media

Context

Describe why this new value is necessary and existing values are not.

Creating new attributes that are applicable to cultural collections and archives. Currently the only attribute that refers to language is "Indigenous term"; we want to be able to record the language spoken or written in media, such as motion film, audio, or archival documents.

Table

Code Tables are documented at http://arctos.database.museum/info/ctDocumentation.cfm. Link to the specific table or value. This may involve multiple tables and will control datatype for Attributes. OtherID requests require BaseURL (and example) or explanation. Please ask for assistance if unsure.

https://arctos.database.museum/info/ctDocumentation.cfm?table=ctattribute_type

Proposed Value

Proposed new value. This should be clear and compatible with similar values in the relevant table and across Arctos.

language

Proposed Definition

Clear, complete, non-collection-type-specific functional definition of the value. Avoid discipline-specific terminology if possible, include parenthetically if unavoidable.

A language of the resource. Via https://www.dublincore.org/specifications/dublin-core/dcmi-terms/ http://purl.org/dc/terms/language

Collection type

_Some code tables contain collection-type-specific values. collection_cde may be found from https://arctos.database.museum/home.cfm_

AV, ARCH, EH, Art (could apply to all)

Attribute Extras

Attribute data type

If the request is for an attribute, what values will be allowed? free-text, categorical, or number+units depending upon the attribute (TBA)

free-text

categorical

Attribute controlled values

If the values are categorical (to be controlled by a code table), add a link to the appropriate code table. If a new table or set of values is needed, please elaborate.

n/a

https://www.loc.gov/standards/iso639-2/php/code_list.php

Attribute units

if numerical values should be accompanied by units, provide a link to the appropriate units table.

n/a

Part preservation attribute effect on "tissueness"

If a new part preservation is requested, please add the effect it would have on "tissueness": No Influence, Allows, or Denies

n/a

Priority

Please describe the urgency and/or choose a priority-label to the right. You should expect a response within two working days, and may utilize Arctos Contacts if you feel response is lacking.

Example Data

Requests with clarifying sample data are generally much easier to understand and prioritize. Please attach or link to any representative data, in any form or format, which might help clarify the request.

for films, to document silent film or English subtitles

Available for Public View

Most data are by default publicly available. Describe any necessary access restrictions.

n/a

Helpful Actions

@ArctosDB/arctos-code-table-administrators @ArctosDB/diversity-and-inclusion @mkoo

Approval

All of the following must be checked before this may proceed.

_The How-To Document should be followed. Pay particular attention to terminology (with emphasis on consistency) and documentation (with emphasis on functionality). No person should act in multiple roles; the submitter cannot also serve as a Code Table Administrator, for example._

Rejection

If you believe this request should not proceed, explain why here. Suggest any changes that would make the change acceptable, alternate (usually existing) paths to the same goals, etc.

  1. Can a suitable solution be found here? If not, proceed to (2)
  2. Can a suitable solution be found by Code Table Committee discussion? If not, proceed to (3)
  3. Take the discussion to a monthly Arctos Working Group meeting for final resolution.

Implementation

Once all of the Approval Checklist is appropriately checked and there are no Rejection comments, or in special circumstances by decree of the Arctos Working Group, the change may be made.

Close this Issue.

DO NOT modify Arctos Authorities in any way before all points in this Issue have been fully addressed; data loss may result.

Special Exemptions

In very specific cases and by prior approval of The Committee, the approval process may be skipped, and implementation requirements may be slightly altered. Please note here if you are proceeding under one of these use cases.

  1. Adding an existing term to additional collection types may proceed immediately and without discussion, but doing so may also subject users to future cleanup efforts. If time allows, please review the term and definition as part of this step.
  2. The Committee may grant special access on particular tables to particular users. This should be exercised with great caution only after several smooth test cases, and generally limited to "taxonomy-like" data such as International Commission on Stratigraphy terminology.

Jegelewicz commented 2 months ago

The DCMI link should be - http://purl.org/dc/terms/language

Recommended practice is to use either a non-literal value representing a language from a controlled vocabulary such as ISO 639-2 or ISO 639-3, or a literal value consisting of an IETF Best Current Practice 47 [IETF-BCP47] language tag.

Can we just make use of one of those resources?

https://www.iso.org/iso-639-language-code

https://www.loc.gov/standards/iso639-2/php/code_list.php
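To make the difference between the two recommended forms concrete, here is a minimal sketch that sorts a value by its shape: a three-letter ISO 639-2 style code (for example `nav` or `eng`), a simplified BCP 47 style tag (for example `en-US`), or free text. The function name `classify` and both patterns are illustrative assumptions, not part of Arctos or of either standard, and the patterns only check shape; they do not confirm that a code is actually assigned.

```python
import re

# Shape checks only (assumption for illustration): neither pattern confirms
# that a code is actually assigned in ISO 639-2 or registered for BCP 47.
ISO_639_2_SHAPE = re.compile(r"[a-z]{3}")
# Simplified BCP 47 shape: a primary language subtag plus an optional region.
BCP47_SIMPLE_SHAPE = re.compile(r"[a-z]{2,3}(-[A-Z]{2})?")


def classify(value: str) -> str:
    """Label a value by shape. Three-letter codes are reported as ISO-like
    first, even though they can also be valid BCP 47 tags."""
    if ISO_639_2_SHAPE.fullmatch(value):
        return "iso639-2-like"
    if BCP47_SIMPLE_SHAPE.fullmatch(value):
        return "bcp47-like"
    return "free text"


for sample in ["nav", "en-US", "Navajo"]:
    print(sample, "->", classify(sample))
```

A categorical attribute would still need a lookup against the chosen list; a shape check like this only helps spot free-text values that slipped in.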

wellerjes commented 1 month ago

Please move forward with the https://www.loc.gov/standards/iso639-2/php/code_list.php

Thanks!

Nicole-Ridgwell-NMMNHS commented 1 month ago

Referring to the Library of Congress standard, for our code table, would we use the ISO code or the English name of the language? I wouldn't think we would want to use a code because users would not understand the code.

wellerjes commented 1 month ago

> I wouldn't think we would want to use a code because users would not understand the code.

Agreed!

Jegelewicz commented 1 month ago

> for our code table, would we use the ISO code or the English name of the language?

I would think the English name, although the code would be more internationally recognized. We can put the codes in the code table as metadata of the language. But I will point out that we are assuming all of these languages have an English name that would be recognized by the community that created/uses them....

dustymc commented 1 month ago

https://glottolog.org/ exists for just this sort of thing, and I think avoids some biases (eg assumptions that living speakers exist) of other things mentioned here. It's not perfect, but perhaps it's the most appropriate tool for this purpose.

I do like the vague definition, and suspect this could be used for everything from field notes and herbarium sheets to incised artifacts.

Jegelewicz commented 1 month ago

Suggested Code table structure for controlled vocabulary

| Term | Description | ISO identifier | Search Terms | Documentation URL | Issue URL |
| --- | --- | --- | --- | --- | --- |
| Navajo | A Southern Athabaskan language of the Na-Dené family, through which it is related to languages spoken across the western areas of North America. | nav | Navaho, Diné bizaad, Naabeehó bizaad | https://glottolog.org/resource/languoid/id/nava1243 | xxxx |

dustymc commented 1 month ago

Lots of things in glottolog will (probably??) not have an ISO (and "ISO" does not seem specific enough to be useful), some will (still probably??) have partial overlap, etc., etc. - I think we should just pick a standard and follow it rather than trying to mix-n-match. (And I'm WAY out of my comfort zone with all of this, I could be convinced that we don't have to anticipate everything and something that covers spoken language in film - I think that's the original request - is close enough for now.)

wellerjes commented 1 month ago

I am not familiar with ISO codes; I am open to discussion about which system is the best for Arctos purposes. I am also fine with creating a code table for language and adding to it as needed, that way there is not a huge code table with a lot of unused values. Thoughts?

Jegelewicz commented 1 month ago

> I think we should just pick a standard and follow it rather than trying to mix-n-match

We can pick a standard to allow the easy-add exemption, and maybe that is https://www.loc.gov/standards/iso639-2/php/code_list.php? Then if someone wants a language that has no ISO code, it just requires sign-off by code table managers.
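The suggested table structure and the easy-add idea can be sketched together as follows. This is only an illustration under stated assumptions: the class `LanguageTerm`, the function `needs_sign_off`, and the two-entry `REFERENCE_LIST` are hypothetical names, not Arctos code, and a real check would consult whichever full list is eventually chosen.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical two-entry sample standing in for the chosen reference list.
REFERENCE_LIST = {"nav": "Navajo", "eng": "English"}


@dataclass
class LanguageTerm:
    """One row of the suggested code table (field names are assumptions)."""
    term: str
    description: str
    iso_identifier: Optional[str] = None
    search_terms: List[str] = field(default_factory=list)
    documentation_url: str = ""
    issue_url: str = ""


def needs_sign_off(entry: LanguageTerm) -> bool:
    """Return True when the entry's identifier is missing from the reference
    list, so the addition would wait for code table managers."""
    return entry.iso_identifier not in REFERENCE_LIST


navajo = LanguageTerm(
    term="Navajo",
    description="A Southern Athabaskan language of the Na-Dené family.",
    iso_identifier="nav",
    search_terms=["Navaho", "Diné bizaad", "Naabeehó bizaad"],
    documentation_url="https://glottolog.org/resource/languoid/id/nava1243",
)
unlisted = LanguageTerm(term="Example language", description="No identifier.")

print(needs_sign_off(navajo))
print(needs_sign_off(unlisted))
```

Keeping the identifier optional reflects the point raised above that some languages will have no ISO code; those entries simply take the slower sign-off path.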

dustymc commented 4 weeks ago

Details to work out, but this is a GO and it does need a new code table, going active for that much.

mkoo commented 4 weeks ago

@wellerjes Please look at https://glottolog.org/glottolog/language Is this good coverage for the languages you need for your material?

CT meeting: we agree that a new CT of language is needed and will accept those found on the collaborative catalog Glottolog.

AJLinn commented 3 weeks ago

> @wellerjes Please look at https://glottolog.org/glottolog/language Is this good coverage for the languages you need for your material?
>
> CT meeting: we agree that a new CT of language is needed and will accept those found on the collaborative catalog Glottolog.

Sorry for the delay in commenting - I've been out sick. I'm not thrilled with the way Glottolog describes Alaska Native languages and assume it might have other problem areas. Presumably we will have the ability in managing the CT to customize those languages & references that are not documented accurately for our particular geographic or intellectual area of specialty?

I've asked a linguist colleague of mine to see if she knows of anything that might be a viable alternative to Glottolog.

AJLinn commented 3 weeks ago

@dustymc @mkoo @wellerjes According to my colleague, Glottolog appears to be full of errors and inconsistencies, but the alternative, Ethnologue, is operated by the Summer Institute of Linguistics, which is an American evangelical Christian nonprofit. They also require registration to access most of their content. You can read more about them on their Wikipedia page: https://en.wikipedia.org/wiki/Ethnologue

My main concern is that we retain the ability in our code table to update anything that aligns with our geographic and intellectual areas of expertise and not be wholly constrained to this outside data. It is simply not going to be accurate for many languages.

I see this issue is closed and in Active Development, in next release. I hope these comments are being seen and can be addressed by those interested in the issue.

Jegelewicz commented 3 weeks ago

@AJLinn The only thing Glottolog was suggested for is to simplify additions to the code table (if it is there, then we are OK with adding it). Anyone can request anything, it's just that things not in Glottolog might take a bit longer.

> They also require registration to access most of their content.

This means a bit of a barrier to use?

Nicole-Ridgwell-NMMNHS commented 3 weeks ago

If Glottolog is full of errors, perhaps it should not be used to simplify additions . . .

Jegelewicz commented 2 weeks ago

I am reopening this so that we can agree on some source for languages.

AJLinn commented 2 weeks ago

https://www.itsmarc.com/crs/mergedprojects/sorcecod/sorcecod/language_code_and_term_source_codes.htm

This is a resource that lists a number of language authorities, as used in MARC. Glottolog is listed as one of the sources. In checking through the other options, the World Atlas of Language Structures seems to be the best for the languages I'd be using (though still not perfect, but better in my view). I also like their mapping element.

I've reached out to a few additional folks on our campus to see what, if any, authority they use when cataloging in the oral history, Alaska Native language archive, and regular archives.

AJLinn commented 2 weeks ago

Here's the Library of Congress language code authority that my colleagues at the archives use for language: https://www.loc.gov/marc/languages/ with the "RDF version here": https://id.loc.gov/vocabulary/languages.html

I'll keep posting suggestions as my connections weigh in.

Jegelewicz commented 2 weeks ago

> Library of Congress

I had this listed as the original source - https://www.loc.gov/standards/iso639-2/php/code_list.php

I don't know how different that is from the two sources listed above, but geez....

wellerjes commented 2 days ago

Reopening; discussed in Cultural Collections meeting today; we want this code table to be populated with names of languages from the Library of Congress source, not glottolog.org