Shavian-info / readlex

The Read Lexicon: a spelling dictionary for the Shavian alphabet following the rhotic Received Pronunciation standard.
MIT License

Strange heteronym #48

Status: Closed (closed by rickyweb 6 months ago)

rickyweb commented 6 months ago

When transcribing this:

John: No, father.

I get the heteronym "number" <-> "no, father", which doesn't make any sense.

·𐑑π‘ͺ𐑯: 𐑯𐑳π‘₯π‘šπ‘Όβ¬Œπ‘―π‘΄, π‘“π‘­π‘žπ‘Ό.

Funnily enough, it's not reproducible if you leave out "John:":

No, father. -> 𐑯𐑴, 𐑓𐑭𐑞𐑼.

It can be reproduced with another proper name:

Mr Fisher: No, father. -> ·𐑥𐑼 𐑓𐑦𐑖𐑼: 𐑯𐑳𐑥𐑚𐑼⬌𐑯𐑴, 𐑓𐑭𐑞𐑼.

But not when the text before the colon is not a proper name:

answers: no, father -> 𐑭𐑯𐑕𐑼𐑟: 𐑯𐑴, 𐑓𐑭𐑞𐑼.

the boy: no, father -> 𐑞 𐑚𐑶: 𐑯𐑴, 𐑓𐑭𐑞𐑼.

I also get the same phenomenon if I replace "father" with "mother":

John: No, mother -> ·𐑑𐑪𐑯: 𐑯𐑳𐑥𐑚𐑼⬌𐑯𐑴, 𐑥𐑳𐑞𐑼.

Shavian-info commented 6 months ago

This is because 'no' can sometimes stand for 'number', as the abbreviation of 'numero' ('No.'). It is an annoyance, however, because the part-of-speech tagger struggles to tell the two readings apart, and given how rare the 'numero' meaning is, I've just deleted it.
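For anyone curious how one extra entry can surface as a heteronym, here is a minimal sketch. It is not ReadLex's actual data format or any transcriber's real code: the `LEXICON` layout, the tag names, the `transcribe_word` helper, and the "PROPN" mis-tag are all hypothetical, chosen only to show the general idea that when the tagger's guess does not single out one entry, every candidate spelling gets shown.

```python
# Shavian spellings taken from the report above.
NUMBER = "𐑯𐑳𐑥𐑚𐑼"  # "number" (the 'numero' reading of "no")
NO = "𐑯𐑴"          # "no" (the ordinary reading)

# Hypothetical mini-lexicon: spelling -> list of (part-of-speech tag, Shavian form).
# The tag names and this layout are assumptions made for illustration only.
LEXICON = {
    "no": [("ABBREVIATION", NUMBER), ("INTERJECTION", NO)],
}


def transcribe_word(word, tag, lexicon):
    """Return the Shavian form for `word`, or all candidates joined by an arrow."""
    entries = lexicon[word.lower()]
    matches = [form for pos, form in entries if pos == tag]
    if len(matches) == 1:
        return matches[0]
    # The tag did not single out one entry, so list every distinct candidate.
    candidates = []
    for _, form in entries:
        if form not in candidates:
            candidates.append(form)
    return "⬌".join(candidates)


# A confident, correct tag picks the ordinary reading.
clear = transcribe_word("No", "INTERJECTION", LEXICON)

# An unhelpful tag (a made-up "PROPN" mis-tag) leaves both readings on the table.
ambiguous = transcribe_word("No", "PROPN", LEXICON)

# Dropping the rare 'numero' entry removes the ambiguity whatever the tag is.
PRUNED = {"no": [("INTERJECTION", NO)]}
after_fix = transcribe_word("No", "PROPN", PRUNED)
```

With only one entry left for 'no', there is nothing for the tagger to get wrong, which is why removing the rare reading is the simplest fix.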