languages with more than one area

In the table Register.csv, some glottocodes are associated with more than one area.

oira1263, for example, is associated with both Inner Asia and Oceania. This seems to be because one of the entries should have the glottocode kalm1243 rather than oira1263 (LID = 1343).

There are 12 cases like this. I think each one should be checked, and the glottocode and ISO 639-3 code probably changed.

 1 oira1263   
 2 toho1245   
 3 tibe1272   
 4 indo1316   
 5 kyer1238   
 6 balk1252   
 7 east2295   
 8 kati1270   
 9 mart1256   
10 noga1249   
11 taha1241   
12 peri1253

Here's a way of finding them in R.

library(tidyverse)
# Keep one row per Glottocode/Area pair, then flag glottocodes that still occur more than once
AUTOTYP <- read_csv("data/csv/Register.csv", col_types = cols()) %>%
  distinct(Glottocode, Area, .keep_all = TRUE) %>%
  mutate(dup = duplicated(Glottocode) + duplicated(Glottocode, fromLast = TRUE)) %>%
  filter(dup > 0)

Some of them make sense, like Tuareg (Air) (LID = 1420) and Tuareg (Ghat) (LID = 1421). The longitude and latitude of the two varieties probably justify the different areas.
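
To go through the cases one by one, it may help to see each glottocode next to the areas and LIDs it occurs with. Below is a rough sketch, not run against the real Register.csv. It assumes the table has Glottocode, Area and LID columns; the function name and the toy rows are made up for illustration. Unlike the snippet above, it drops rows with a missing Glottocode.

```r
library(dplyr)

# Hypothetical helper: one row per Glottocode that occurs with more than
# one distinct Area, listing those areas and the LIDs involved.
# Assumes `register` has the columns Glottocode, Area and LID.
glottocodes_with_multiple_areas <- function(register) {
  register %>%
    filter(!is.na(Glottocode)) %>%   # assumption: missing glottocodes are not of interest
    group_by(Glottocode) %>%
    summarise(
      n_areas = n_distinct(Area),
      areas   = paste(sort(unique(Area)), collapse = "; "),
      lids    = paste(sort(unique(LID)), collapse = ", "),
      .groups = "drop"
    ) %>%
    filter(n_areas > 1)
}

# Toy rows for illustration only; these are not real AUTOTYP entries.
toy <- data.frame(
  LID        = c(1, 2, 3, 4),
  Glottocode = c("aaaa1111", "aaaa1111", "bbbb2222", "bbbb2222"),
  Area       = c("Area X", "Area Y", "Area Z", "Area Z"),
  stringsAsFactors = FALSE
)

result <- glottocodes_with_multiple_areas(toy)
print(result)
```

On the real data this would presumably be called on the result of read_csv("data/csv/Register.csv"). Having the areas and LIDs side by side makes it easier to separate likely glottocode errors from cases like the two Tuareg varieties.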
