langcog / peekbank-data-import

data import code for peekbank

language coding for low-resource languages #4

Open JMankewitz opened 3 years ago

JMankewitz commented 3 years ago

We can't find Tseltal in ISO 639-2/B, but that encoding is enforced. We're coding it as multiple for now, but should we have an "other" option?

marisacasillas commented 3 years ago

If we're going to have an "other" category, we'll want to document whatever detail we have about the language in some other field. Sorry if that's already there and I missed it.

adriansteffan commented 1 week ago

@mzettersten Let's take a look at this in one of our next meetings. I would suggest we add details for niche languages in trial_type_aux_data and subject_aux_data, as these are the two tables that need the ISO 639-2/B format in one of their fields.

Once we have decided on the format and documented it in the column info table, I will go ahead and fix the affected import scripts.
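
For discussion, here is a minimal sketch of what that suggestion could look like in an import script. Everything in it is an assumption, not the agreed format: the helper name code_language, the "language_detail" key, and the tiny code subset are all hypothetical, and the fallback of "mul" simply mirrors the "multiple" workaround mentioned above. It is written in Python for illustration only.

```python
import json

# Illustrative subset only; a real check would load the full ISO 639-2/B list.
KNOWN_ISO_639_2B = {"eng", "fre", "ger", "spa", "mul"}


def code_language(name, iso_code=None, fallback="mul"):
    """Return (language_code, aux_json) for one language.

    If iso_code is a known ISO 639-2/B code, use it and store no extra detail.
    Otherwise fall back to a placeholder code and keep the language name in a
    JSON string that could go into an aux data field.
    The "language_detail" key is hypothetical, not a documented peekbank field.
    """
    if iso_code in KNOWN_ISO_639_2B:
        return iso_code, None
    aux = {"language_detail": {"name": name}}
    return fallback, json.dumps(aux)


if __name__ == "__main__":
    print(code_language("English", "eng"))
    print(code_language("Tseltal"))
```

Keeping the enforced column valid while moving the detail into the aux field would mean the existing validation does not need to change, only the documentation of the aux format.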

mzettersten commented 4 days ago

Ok, Adrian and I made two decisions: