googlefonts / diffenator2

A font comparison tool that will not stop until your fonts are exhaustively compared.
Apache License 2.0

Some words in Adlam.txt look improperly parsed #64

Open NeilSureshPatel opened 1 year ago

NeilSureshPatel commented 1 year ago

There are a number of words in Adlam.txt that appear to be improperly parsed. They all have a capital "Na" in the middle of the word, followed by a "nyondal" (apostrophe) and a "ba".

Sample: 𞤤𞤫𞤴𞤯𞤫𞥅𞤼𞤫𞤐'𞤦𞤫𞤤𞤫 𞤥𞤢𞤳𞥆𞤮𞤐'𞤦𞤫𞤤𞤫

It looks like each of these is two words that were merged. Perhaps they started out as:

𞤤𞤫𞤴𞤯𞤫𞥅𞤼𞤫 𞤐'𞤦𞤫𞤤𞤫 𞤥𞤢𞤳𞥆𞤮 𞤐'𞤦𞤫𞤤𞤫
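One way to locate the affected entries is to look for an Adlam capital letter anywhere other than the start of a word and split there. The following is only a minimal sketch of that idea, not how diffenator2 builds its word lists: the helper names `is_adlam_capital`, `find_suspects`, and `split_merged` are hypothetical, and it assumes the word list is available as plain Python strings.

```python
import unicodedata

# Unicode Adlam block (U+1E900..U+1E95F).
ADLAM_START, ADLAM_END = 0x1E900, 0x1E95F


def is_adlam_capital(ch):
    """Return True if ch is an uppercase letter inside the Adlam block."""
    return ADLAM_START <= ord(ch) <= ADLAM_END and unicodedata.category(ch) == "Lu"


def find_suspects(words):
    """Return the words that contain an Adlam capital after the first character."""
    return [w for w in words if any(is_adlam_capital(c) for c in w[1:])]


def split_merged(word):
    """Split a word in front of every Adlam capital that is not word-initial."""
    parts, start = [], 0
    for i in range(1, len(word)):
        if is_adlam_capital(word[i]):
            parts.append(word[start:i])
            start = i
    parts.append(word[start:])
    return parts


if __name__ == "__main__":
    # Hypothetical short sample built from escapes:
    # small LAAM, small E, capital NUN, apostrophe, small BA, small E.
    merged = "\U0001E924\U0001E92B\U0001E910'\U0001E926\U0001E92B"
    print(find_suspects([merged]))
    print(split_merged(merged))
```

This only flags candidates. Whether a mid-word capital is always an extraction error in the source text is an assumption that would need checking against the original material.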

I was going to clean these up, but I think this may be related to an upstream issue with how the words are extracted from the source material. Even if the words are separated as shown above, the text would still not be quite right: word-initial prenasalized consonants in Adlam do not use the nyondal.

These words would properly be written as 𞤐𞤦𞤫𞤤𞤫 and 𞤐𞤦𞤫𞤤𞤫. So there may be two issues here: one in the parsing and one in the source material.
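The spelling rule described above (no nyondal after a word-initial nasal) could be applied as a cleanup step once the words have been separated. This is a rough sketch under stated assumptions: `drop_initial_nyondal` is a hypothetical helper, the nasal is taken to be ADLAM CAPITAL/SMALL LETTER NUN (U+1E910 and U+1E932), and the nyondal is assumed to appear as either an ASCII apostrophe or U+2019.

```python
# Assumed code points: ADLAM CAPITAL LETTER NUN and ADLAM SMALL LETTER NUN.
NUN = {"\U0001E910", "\U0001E932"}
# Assumed nyondal encodings: ASCII apostrophe and right single quotation mark.
NYONDAL = {"'", "\u2019"}


def drop_initial_nyondal(word):
    """Remove a nyondal that directly follows a word-initial NUN."""
    if len(word) > 2 and word[0] in NUN and word[1] in NYONDAL:
        return word[0] + word[2:]
    return word


if __name__ == "__main__":
    # Capital NUN, apostrophe, small BA, small E (hypothetical sample).
    sample = "\U0001E910'\U0001E926\U0001E92B"
    print(drop_initial_nyondal(sample))
```

The function deliberately leaves a nyondal in any other position untouched, since the rule stated above covers only the word-initial case.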