TheHockeyist / russian-untransliterator

This is a program to convert Latin text, e.g. zdravstvujte, into its Cyrillic equivalent - здравствуйте.
MIT License

Щ - using the modern sch instead of the traditional shch. #12

Open TheHockeyist opened 7 years ago

TheHockeyist commented 7 years ago

Normally, shch (or some variant thereof) is used for the letter щ. This makes our life easy, since having шч is outright forbidden in Russian words. Find all of the шч's, replace with щ, done. ;-). Forget the letter, and move on.

Of course, there's the rare веснушчатый (freckled), but we can ignore that one, right?
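A minimal sketch of that find-and-replace step, in Python purely for illustration. This thread doesn't show the project's own code, so `untransliterate_shch` and `SHCH_EXCEPTIONS` are hypothetical names. It only handles the `shch` step and leaves the rest of the word in Latin for the other rules to deal with:

```python
# Hypothetical exception list: Latin stems where "shch" really is ш + ч.
# Only the one example from this thread is listed.
SHCH_EXCEPTIONS = {"vesnushch": "веснушч"}


def untransliterate_shch(word: str) -> str:
    """Replace Latin 'shch' with щ, except in listed stems (sketch only)."""
    lowered = word.lower()
    for latin_stem, cyrillic_stem in SHCH_EXCEPTIONS.items():
        if lowered.startswith(latin_stem):
            # Keep the ш + ч pair for the stem, convert the remainder normally.
            rest = lowered[len(latin_stem):]
            return cyrillic_stem + rest.replace("shch", "щ")
    return lowered.replace("shch", "щ")


print(untransliterate_shch("shchi"))          # щi
print(untransliterate_shch("vesnushchatyj"))  # веснушчatyj
```

The exception list is checked first so that the blanket replacement never touches веснушчатый.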

However, sch is now sometimes encountered for щ. Why? The main reason is the historical development of the Russian language: щ doesn't actually represent a true "shch" sound anymore.

Originally, щ represented a sound like "щт". The sound then hardened into "шт". It still is pronounced "штъ" in Bulgarian. (They even preserve the vowel-like sound of the letter ъ, so it sounds like "shtuh" there. That "uh" sound disappeared in Russian, and it led to the 1917 spelling reform which almost eliminated the letter ъ from the alphabet.)

The sound for щ then changed into something like шч (штш). The ш-like sound transferred to the other part and the т sound drifted towards the middle. Why? It makes long words easier to pronounce. (It's convenient in a word like защищающийся. Is it easier to say за_шт_иштаю_шт_ийся or за_шч_ишчаю_шч_ийся? The Western and Eastern Slavs agreed that the second was easier to say.) You can still hear the sound in Ukrainian today, where щ still makes the шч sound.

The sound softened into "щч" ("shch"), which was considered normal in Russian for a long time. In fact, it's the normal sound in most of the Slavic languages now, such as Polish. If you took Russian classes prior to about 1960 or so, you would have learned in the alphabet unit that щ = shch, like "щч".

One problem with that: щ and ч sound quite similar. In a way, you're really pronouncing "щтщ" every time. Since the pure щ sound didn't exist in Russian at the time, and as speech became faster, the т sound in the middle finally dropped out, and the letter щ represented a pure consonant sound instead of a consonant cluster for the first time in its history. The word защищающийся was slurred so much that the letter щ came to represent one sound that was neither шч nor шт, all to make pronunciation easier.

And now, the only difference between ш and щ is where your tongue is in your mouth as you pronounce them. They both represent a sound that is like "sh" to English speakers, but щ is further forward like a hissing smile and ш is further back with a low pitch, like you're blowing on top of a glass bottle to play a note. This is the only difference between the two sounds now, and a lot of English speakers don't notice.

They're still different sounds, though. Using shch for щ now is traditional and accepted, but it now seems a bit old-fashioned and... obsolete. It doesn't represent the true sound of the letter anymore, which is more like sh. So to try to transliterate, some people are settling on sch now, which is like sh, but different. Enough to make it distinct from sh and to not cause problems when transliterating Russian!

Only it does cause problems.

The letters сч are also transliterated as sch, and even make the same sound! (Maybe this is why they chose it?) But if one string of letters (sch) can be transliterated back into two different Cyrillic outputs (щ and сч), which one is it, and when?

I'll help.

Normally, it's щ as always, but sometimes, it's сч. Words with счастье or some variant thereof are common. The word счёт and the verb считать are other ones, as well as a few others. These words are pronounced with the сч sounding like щ, so щастье, щитать, etc. But they're not spelled that way. You write сч, but say щ.
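One way to sketch the "щ by default, сч for known words" rule, in Python for illustration only. The stem list and the name `resolve_sch` are hypothetical, and the stems cover only счастье and считать from the examples above. счёт is left out because this thread doesn't say how ё is written in Latin. As in the rest of the thread, only the `sch` part is converted here:

```python
# Hypothetical list of Latin stems in which "sch" stands for сч, not щ.
# Every stem must begin with "sch" for the check below to work.
SCH_AS_S_CH_STEMS = ("schast", "schitat")


def resolve_sch(word: str) -> str:
    """Turn each Latin 'sch' into сч if a listed stem starts there, else щ."""
    lowered = word.lower()
    out = []
    i = 0
    while i < len(lowered):
        if lowered.startswith("sch", i):
            is_s_ch = any(lowered.startswith(stem, i) for stem in SCH_AS_S_CH_STEMS)
            out.append("сч" if is_s_ch else "щ")
            i += 3  # skip past the three Latin letters just consumed
        else:
            out.append(lowered[i])
            i += 1
    return "".join(out)


print(resolve_sch("schast'e"))  # счast'e
print(resolve_sch("schi"))      # щi
```

The stems have to be chosen carefully: a shorter stem like `schit` would also catch щит, which comes out as schit under the same scheme, so a stem list like this is still a guess rather than a guarantee.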

A lot of Russian prefixes, like рас- and бес-, end in с, and the root that follows can start with ч. In this case, the prefix and the root are pronounced separately, and the с and ч do not blend together into a щ sound. For example, расчертить. It also happens in a few other cases like переписчик. An unusual case is расщепить, which would be rasschepit'. But рассчитать would be rasschitat'. From the transliterations alone, how do you know that one is ссч and the other is сщ? You don't. You have to guess here or use a dictionary. (The combinations жч and зч also get pronounced as щ, like мужчина and резчик, but in this case, the transliteration with a zh or a z makes it clear which Cyrillic letters to use.)
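The "use a dictionary" option could be sketched like this: transliterate every known Cyrillic word forward into Latin, then look the input up in that index. This is only an illustration in Python. The letter table is an assumption (the thread only confirms j for й, ch for ч, sh for ш and sch for щ), and `to_latin`, `build_index` and `lookup` are hypothetical names:

```python
from collections import defaultdict

# Assumed Cyrillic -> Latin table using the newer "sch" for щ.
# Most of these choices are guesses, not the project's actual scheme.
TABLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "jo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "ch", "ш": "sh", "щ": "sch",
    "ъ": '"', "ы": "y", "ь": "'", "э": "e", "ю": "ju", "я": "ja",
}


def to_latin(word: str) -> str:
    """Forward-transliterate one Cyrillic word with the assumed table."""
    return "".join(TABLE.get(ch, ch) for ch in word.lower())


def build_index(words):
    """Map each Latin spelling to every Cyrillic word that produces it."""
    index = defaultdict(list)
    for word in words:
        index[to_latin(word)].append(word)
    return index


def lookup(index, latin: str):
    """Return all known Cyrillic words for a Latin spelling (maybe none)."""
    return index.get(latin.lower(), [])


# Tiny stand-in word list; a real one would come from a dictionary file.
INDEX = build_index(["расщепить", "рассчитать", "расчертить"])

print(lookup(INDEX, "rasschepit'"))  # ['расщепить']
print(lookup(INDEX, "rasschitat'"))  # ['рассчитать']
```

If two dictionary words ever produce the same Latin spelling, `lookup` returns both, and the program still has to guess or ask the user.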

If the letter щ were always transliterated as shch (as most people still do), none of these problems would happen, and it would be unambiguous. But when you move in with the new sch, these issues pop up. How should we address them?

TheHockeyist commented 7 years ago

I think the best we can do here is a case-by-case basis.

Tymewalk commented 7 years ago

Similar to "j", could we check for "shch" and decide then?

TheHockeyist commented 7 years ago

Actually, perfect solution!

щ is not a very common letter, though.

Tymewalk commented 7 years ago

@TheHockeyist Even better, it means we'll only have very few cases to deal with.

TheHockeyist commented 7 years ago

@Tymewalk Well, although it's worth 10 points in Russian Scrabble and there's only one tile for it (excluding the blanks), there are still a lot of words with щ. Just because you can play щи every time to get rid of it doesn't mean that's the only word in the language. See my comments above. Щ comes up about 0.30% of the time in the entire Russian language. This means that it's less common than k in English, but still more common than j. In fact, щ is about as common in Russian as the letter j is in German.

Think of some of the words in English or German that contain a j.

A similar number of words in Russian are expected to contain щ.

It's not the rarest letter in Russian (that award goes to ъ), but still, it's not just "very few cases".