Open fxru opened 9 years ago
This is caused by the TEI files having the wrong information about which encoding is used for those words.
Once fixed in the TEI this problem should disappear.
Closely related to issue #20.
The problem is that those are not necessarily Sanskrit words. In pwg we have `pra1kritisch`, which should be rendered `prākritisch`, but it is a German adjectival form based on the romanisation `prākrit` (from प्राकृत prākṛta).
I think the remaining `a1`, `n2`, `s3`, etc. should be dealt with as atoms and not as context-dependent. These words are often English or German words, or abbreviations for texts such as the one discussed in issue #20.
In the case of English or German words (and in these abbreviations) a conversion to Unicode would be best, thus `a1` to `ā`, etc.
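To illustrate what "atomic" conversion could look like, here is a minimal Python sketch. The names `AS_TO_UNICODE` and `as_to_unicode` are hypothetical and not taken from the existing code base. The table contains only the two pairs named in this thread (`a1` to `ā`, `n2` to `ṇ`); the target for `s3` and the other codes is not stated here, so they are deliberately left out and would have to be filled in from the actual encoding table.

```python
import re

# Hypothetical, deliberately partial table: only the pairs mentioned in this
# thread. Other codes (s3, ...) must be added from the real encoding table.
AS_TO_UNICODE = {
    "a1": "ā",
    "n2": "ṇ",
}

# One alternation over all known codes, so each code is replaced as a unit
# ("atom") regardless of the characters around it.
_CODE_PATTERN = re.compile("|".join(re.escape(code) for code in AS_TO_UNICODE))


def as_to_unicode(text: str) -> str:
    """Replace every known letter+digit code in text with its Unicode character.

    Codes missing from AS_TO_UNICODE are left untouched.
    """
    return _CODE_PATTERN.sub(lambda m: AS_TO_UNICODE[m.group(0)], text)


if __name__ == "__main__":
    print(as_to_unicode("pra1kritisch"))
```

Because the replacement ignores context, it would also rewrite a genuine letter-plus-digit sequence that happens to match a code, so it is probably safest to apply it only to the text that is known to use this encoding.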
I wonder how many genuine undetected Sanskrit words are left in monier and pwg.
Diacritics such as `n2` for `ṇ` or `a1` for `ā` are not dealt with in pwg and monier. Those appear especially in the references, but also elsewhere. Some occurrences are dealt with in the legacy interface, some are not.

![diacritics](https://cloud.githubusercontent.com/assets/1352294/5462052/bf3ec01a-8570-11e4-91df-95e84310fe16.png)