Closed AaronGullickson closed 3 years ago
Ok, I have looked through this and the general codes are definitely not good enough. They provide a lot of detail for European languages but then lump vast geographic regions together (e.g. "Sub-Saharan African").
The detailed codes work well in many cases, although the 1980 data often provide too much detail because nothing was re-coded from what respondents provided. Ideally, I would also want to look at a linguistic measure of language similarity, but this would be quite an undertaking.
I think what I will do is use the detailed codes, with adjustments to make the two time periods more comparable.
In general, I will use the detailed codes with the following adjustments:
There is also the problem of groupings of "other" or "nec" languages. These should all probably be given one code and not be considered endogamous with anything else. These would be the following codes:
>= 9300, all the other NEC codes
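
As a rough sketch of how the "other"/"nec" handling could work, something like the following might do. This is only an illustration under assumptions: the function names, the `NEC_POOLED` sentinel value, and the `extra_nec_codes` parameter are all hypothetical, and the remaining NEC codes are not enumerated here, so they have to be passed in. It also reads "not endogamous with anything else" as meaning a pooled code never counts as a match, even against another pooled code.

```python
# Hypothetical sentinel for the pooled "other"/"nec" group; not a real census code.
NEC_POOLED = 9999

# From the note above: detailed codes >= 9300 are treated as "other"/"nec".
NEC_THRESHOLD = 9300


def recode_language(code, extra_nec_codes=frozenset()):
    """Collapse 'other'/'nec' detailed codes into one pooled code.

    extra_nec_codes is a placeholder for the remaining NEC codes below the
    threshold, which still need to be listed.
    """
    if code >= NEC_THRESHOLD or code in extra_nec_codes:
        return NEC_POOLED
    return code


def is_endogamous(code_a, code_b, extra_nec_codes=frozenset()):
    """Return True only if both partners share the same non-pooled code."""
    a = recode_language(code_a, extra_nec_codes)
    b = recode_language(code_b, extra_nec_codes)
    # Assumption: the pooled group is never endogamous, not even with itself,
    # because it lumps together unrelated languages.
    if a == NEC_POOLED or b == NEC_POOLED:
        return False
    return a == b
```

For example, with made-up codes, `is_endogamous(1200, 1200)` is true, while `is_endogamous(9300, 9300)` is false because both partners fall into the pooled group.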
I need to do a deeper dive into the difference between language and languaged to make sure I am capturing the relevant categorization whenever possible.