Detecting transliteration systems used in GeoNames data set

The GeoNames data set contains entries like these:

Row 2 contains the name "西博寮海峽", and its NAME_LINK points to Row 4, "West Lamma Channel". Both are names of the same geographic location, but neither is a transliteration of the other. The row that actually transliterates Row 2 is Row 1, "Sai Puk Liu Hoi Hap".

There are two problems here:

Row 1 should have its NAME_LINK pointing to Row 2 (i.e. its NAME_LINK should be -1950489, which is Row 2's UID, since NAME_LINK is supposed to be bidirectional), and its TRANSL_CD should be set to the Cantonese transliteration system, because Row 1 is generated by transliterating Row 2.
Row 3 is also generated by transliterating Row 2, so its TRANSL_CD should be set to the Mandarin transliteration system. However, it is unclear what its NAME_LINK should be set to, because NAME_LINK seems to support only a pairing of two entries, not a one-to-many relationship.
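
One possible way around the pairing limit is to record the links in our own output instead of in NAME_LINK. The sketch below is an assumption, not something the GeoNames format or this repository defines: the `TransliterationLink` record, the `group_by_source` helper, and the system labels are hypothetical, and the UIDs for Row 1 and Row 3 are placeholders because only Row 2's UID is given above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TransliterationLink:
    """Hypothetical output record: one romanized row derived from one native-script row."""
    source_uid: int   # UID of the native-script row (e.g. Row 2)
    derived_uid: int  # UID of the romanized row (e.g. Row 1 or Row 3)
    system: str       # label of the detected transliteration system


def group_by_source(links):
    """Group links by source row, so one source can have many derived rows."""
    grouped = defaultdict(list)
    for link in links:
        grouped[link.source_uid].append(link)
    return dict(grouped)


ROW2_UID = -1950489  # taken from the issue text
ROW1_UID = 1         # placeholder, the real UID is not given here
ROW3_UID = 3         # placeholder, the real UID is not given here

links = [
    TransliterationLink(ROW2_UID, ROW1_UID, "cantonese"),
    TransliterationLink(ROW2_UID, ROW3_UID, "mandarin"),
]
by_source = group_by_source(links)
```

Keeping the relationship in a separate list of links leaves the original NAME_LINK values untouched while still letting Row 2 point at both of its transliterations.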

The point of this task is to detect that Row 3 comes from Row 2, detect the transliteration system, and pair them in the output we produce.
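
The detection step could be sketched roughly as follows: transliterate the native-script name with each candidate system and see which result matches the romanized row. This is only an illustration under stated assumptions. The per-character tables are toy data covering this one name (the Cantonese syllables are read off Row 1, the Mandarin ones are standard Pinyin without tones), the Mandarin test string is a made-up spelling and not the actual content of Row 3, and `detect_system` and the other names are hypothetical.

```python
# Toy per-character tables (assumption): real tables would come from a
# full transliteration map for each system, not from this one name.
TABLES = {
    "cantonese": {"西": "sai", "博": "puk", "寮": "liu", "海": "hoi", "峽": "hap"},
    "mandarin": {"西": "xi", "博": "bo", "寮": "liao", "海": "hai", "峽": "xia"},
}


def normalize(text):
    """Lowercase and drop spaces/punctuation so word breaks do not matter."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def transliterate(native, table):
    """Map each character through the table; None if any character is unknown."""
    try:
        return "".join(table[ch] for ch in native)
    except KeyError:
        return None


def detect_system(native, romanized, tables=TABLES):
    """Return the first system whose output matches the romanized name, else None."""
    target = normalize(romanized)
    for system, table in tables.items():
        candidate = transliterate(native, table)
        if candidate is not None and candidate == target:
            return system
    return None
```

With these tables, `detect_system("西博寮海峽", "Sai Puk Liu Hoi Hap")` picks the Cantonese table, while a translated name such as "West Lamma Channel" matches no table and yields `None`. Exact matching is the simplest choice; real data would likely need fuzzy matching to tolerate spelling variants and characters with several readings.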
