reutenauer / polyglossia

An alternative to Babel for XeLaTeX and LuaLaTeX
http://www.ctan.org/pkg/polyglossia
MIT License

"--- shortcut for em dash is rendered as three hyphens #445

Closed intractabilis closed 4 years ago

intractabilis commented 4 years ago

The documentation states that if babelshorthands = true, then "--- is rendered as an em dash. In practice, however, it is simply output as three hyphens.

MWE:

\documentclass[12pt, letterpaper]{scrbook}

\usepackage[autopunct]{csquotes}
\usepackage{polyglossia}

\setdefaultlanguage[babelshorthands = true]{russian}
\SetLanguageKeys{russian}{
    hyphenmins  = {2, 2}
  , indentfirst = true
  }

% it's easier to notice with Garamond
\setmainfont{EB Garamond}[
    Numbers    = {Proportional, Uppercase}
  , RawFeature = {+ss6}
  ]
\newfontfamily\cyrillicfont[
    Numbers           = {Proportional, Uppercase}
  , RawFeature        = {+ss6}
  ]{EB Garamond}

\setsansfont{FreeSans}
\newfontfamily\cyrillicfontsf{FreeSans}
\setmonofont{FreeMono}
\newfontfamily\cyrillicfontmono{FreeMono}

\KOMAoptions{fontsize = 12, BCOR = 8.25mm}
\KOMAoptions{DIV = 14}

\pghyphenation{russian}{
    фран-цу-жен-кою
}

\begin{document}

Все счастливые семьи похожи друг на друга, каждая несчастливая семья
несчастлива по-своему.

Всё смешалось в доме Облонских. Жена узнала, что муж был в связи с бывшею в их доме
француженкою-гувернанткой, и объявила мужу, что не может жить с ним в одном доме. Положение это
продолжалось уже третий день и мучительно чувствовалось и самими супругами, и всеми членами семьи, и
домочадцами. Все члены семьи и домочадцы чувствовали, что нет смысла в их сожительстве и что на
каждом постоялом дворе случайно сошедшиеся люди более связаны между собой, чем они, члены семьи и
домочадцы Облонских. Жена не выходила из своих комнат, мужа третий день не было дома. Дети бегали по
всему дому, как потерянные; англичанка поссорилась с экономкой и написала записку приятельнице,
прося приискать ей новое место; повар ушел вчера со двора, во время самого обеда; чёрная кухарка и
кучер просили расчёта.

На третий день после ссоры князь Степан Аркадьич Облонский "--- Стива, как его звали в свете, "--- в
обычный час, то есть в восемь часов утра, проснулся не в спальне жены, а в своем кабинете, на
сафьянном диване. Он повернул свое полное, выхоленное тело на пружинах дивана, как бы желая опять
заснуть надолго, с другой стороны крепко обнял подушку и прижался к ней щекой; но вдруг вскочил, сел
на диван и открыл глаза.

\textquote{Да, да, как это было? "--- думал он, вспоминая сон. "--- Да, как это было? Да! Алабин
давал обед в Дармштадте; нет, не в Дармштадте, а что-то американское. Да, но там Дармштадт был в
Америке. Да, Алабин давал обед на стеклянных столах, да, "--- и столы пели: Il mio
tesoro\footnote{Моё сокровище (\textit{итал.}).} и не Il mio tesoro, а что-то лучше, и какие-то
маленькие графинчики, и они же женщины}, "--- вспоминал он.
\end{document}

Result: (image)

Should be: (image)

jspitz commented 4 years ago

The problem here is the following: Since the Cyrillic dash (0.8em length) is not available in Unicode and most fonts, polyglossia (and babel) provide a command \cyrdash which attempts to emulate such a dash. This just blends two en-dashes into each other at the correct length:

\def\cyrdash{\hbox to.8em{--\hss--}}%

Now in your case, TeX ligatures are disabled, so -- is not merged into an en dash (as you can see if you enter -- and --- literally). This happens because you set the font options for \cyrillicfont yourself, and they do not include Ligatures = TeX. Adding it:

\newfontfamily\cyrillicfont[
    Numbers = {Proportional, Uppercase},
    RawFeature = {+ss6},
    Ligatures = TeX
  ]{EB Garamond}

would work.

Anyway, the more robust solution is to use \textendash in the definition of \cyrdash, and I will do that.
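A minimal sketch of what such a definition might look like, assuming the only change is to swap the -- ligatures for \textendash (the definition actually committed to polyglossia may differ):

```latex
% Sketch only: same 0.8em box as above, but built from \textendash so that
% it no longer depends on the Ligatures = TeX font feature being active.
\def\cyrdash{\hbox to.8em{\textendash\hss\textendash}}
```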

LSinev commented 4 years ago

@kia999 Can you please comment on using \textendash in the definition of \cyrdash nowadays? https://tex.stackexchange.com/questions/294178/what-about-cyrdash-in-eu1-and-eu2-encodings https://tex.stackexchange.com/questions/139748/cyrillic-em-dash-is-rendered-inconsistently

intractabilis commented 4 years ago

Since the Cyrillic dash (0.8em length)

Hmm... this is the first time I hear that. A quick look into Russian Wikipedia reveals (in my translation): "contemporary typesetting rules don't define the length of the Russian tiret (loan French word for em dash): it is implicitly assumed that there exists a unique tiret symbol defined by the font design".

jspitz commented 4 years ago

Hmm... this is the first time I hear that. A quick look into Russian Wikipedia reveals (in my translation): "contemporary typesetting rules don't define the length of the Russian tiret (loan French word for em dash): it is implicitly assumed that there exists a unique tiret symbol defined by the font design".

If so, how do you enter this in LaTeX?

intractabilis commented 4 years ago

If so, how do you enter this in LaTeX?

I am a fairly recent LaTeX user and have always used LuaTeX, so... just as the Unicode character U+2014: "—". Or even "~—" to avoid breaking the line before "—" according to the Russian typesetting rules. I was curious about using "---, because I thought that maybe it will take care of not breaking the line before the "—" without ~, you know, in a true LaTeX spirit of separating the content and presentation. It wasn't because of a special length.

jspitz commented 4 years ago

I am a fairly recent LaTeX user and have always used LuaTeX, so... just as the Unicode character U+2014: "—". Or even "~—" to avoid breaking the line before "—" according to the Russian typesetting rules.

OK, then literal — or --- or \textemdash is what you want.
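For illustration, the three inputs can be combined with the tie mentioned in the quoted comment. This is a sketch that assumes XeLaTeX or LuaLaTeX, a font containing U+2014, and (for the --- form) active TeX ligatures; the sample words are taken from the MWE:

```latex
% Three equivalent ways to get a regular em dash (U+2014); the tie (~)
% keeps the dash on the same line as the preceding word.
Облонский~— Стива              % literal U+2014 character
Облонский~--- Стива            % TeX ligature, needs Ligatures = TeX
Облонский~\textemdash{} Стива  % explicit text command
```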

"--- is something different.

I was curious about using "---, because I thought that maybe it will take care of not breaking the line before the "—" without ~, you know, in a true LaTeX spirit of separating the content and presentation. It wasn't because of a special length.

No, that is a misunderstanding. Apparently some people need the slightly shorter em dash.

intractabilis commented 4 years ago

Thank you for the clarification. Anyway, it seems it was a valid concern, and I am glad it is fixed.

intractabilis commented 4 years ago

I have carefully studied the Soviet state standards for typefaces, Cyrillic books of the pre-computer era, and textbooks on typography. Nowhere could I find a "Russian tiret" shorter than the Cyrillic letter М. In some fonts it was even slightly longer. The standards of that time did not in fact include a picture of the "tiret", but judging by the actual books, everywhere I looked the length of the "Russian tiret" corresponds to the length of the em dash in contemporary computer fonts.

Therefore, by working on this "short Cyrillic dash" feature, you are satisfying a rather extravagant request from fringe individuals. IMHO this shouldn't be part of a mainstream package.

A thousand times more useful for everyone typing Russian texts would be a normal-length "--- with the Russian typography rules applied automatically: fixed 2pt spaces on both sides (or only on the right, if the em dash starts a paragraph), with the left space unbreakable.
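The behaviour requested here could be approximated at user level. The following is only a sketch: \rusdash is a hypothetical macro name, not something polyglossia or babel provides, and it does not handle the paragraph-initial case (where only the right-hand space is wanted):

```latex
% Hypothetical user-level macro, not part of polyglossia or babel.
% Typesets a regular em dash with fixed 2pt spaces on both sides; a line
% break is forbidden before the dash and allowed after it.
\newcommand*{\rusdash}{%
  \leavevmode
  \unskip            % drop the ordinary interword space typed before the macro
  \nobreak           % forbid a line break at the following space
  \hskip 2pt\relax   % fixed left space
  \textemdash
  \hskip 2pt\relax   % fixed right space; a break is allowed here
}
% Usage (the space after the control word is swallowed by TeX):
%   Облонский \rusdash Стива
```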

LSinev commented 4 years ago

@intractabilis will you enforce lengthening of the "Russian tiret" for pdflatex in the babel package, where it has existed AFAIK for 20+ years (and probably for longer than the whole polyglossia package)? It is a drastic change, actually.

jspitz commented 4 years ago

@LSinev no worries we won't change that (due to backwards compatibility). @intractabilis you can simply redefine \cyrdash for your needs:

\def\cyrdash{\textemdash}

The spaces on both sides are already accounted for: "--- adds a 0.2em (unbreakable) space on both sides.
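As a sketch of where such a redefinition could go in a preamble like the MWE above: wrapping it in \AtBeginDocument is an assumption, made so that it runs after the language definitions have been loaded. It has not been checked against every polyglossia version, and it may need adjusting if the language file redefines \cyrdash on each language switch.

```latex
\usepackage{polyglossia}
\setdefaultlanguage[babelshorthands = true]{russian}

% Assumption: \cyrdash is only looked up when "--- is expanded in the
% document body, so redefining it at \begin{document} is late enough.
\AtBeginDocument{\def\cyrdash{\textemdash}}
```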

intractabilis commented 4 years ago

@intractabilis will you enforce lengthening of the "Russian tiret" for pdflatex in the babel package, where it has existed AFAIK for 20+ years (and probably for longer than the whole polyglossia package)? It is a drastic change, actually.

It's a strange request. Why should I enforce something in babel? You seem to be suggesting that I am asking for something for myself. (Btw, there is no such thing as an elongated tiret or a shortened tiret. There is one tiret, which is the em dash in Russian typesetting.)

I just shared my opinion of what I believe is best for the community. If you disagree, we can leave everything as is. If other people agree, we can work together, after a discussion, to make the best choice about enforcing or not enforcing something in babel. There is no need for "please enforce something in babel and then we will talk". As for my personal needs, I don't care, I am good.

As I said, there is no such thing as a "Cyrillic dash (0.8em length)" in Russian typesetting. Some extravagant person introduced it to babel 20 years ago, probably because she needed it in some of her documents. This "drastic" change would alter her 20-year-old documents if she ever switched them to polyglossia. This is bad, I guess. But the real question is: why should a poor choice made by one individual 20 years ago dictate what everyone else needs?

intractabilis commented 4 years ago

The spaces on both sides are already accounted for: "--- adds a 0.2em (unbreakable) space on both sides.

This is not what you said before. However, this anyway doesn't correspond to the typesetting rules.

ivankokan commented 4 years ago

Why don't you ask for changes in babel as well?

intractabilis commented 4 years ago

Why don't you ask for changes in babel as well?

I don't know. As I said, my personal needs are well taken care of. I had in mind a discussion of the best way for the community of Russian text authors to deal with this strange non-existent "Cyrillic dash (0.8em length)". If we agree that this should be changed, we can ask babel. Or I can ask babel with your support. Otherwise, I am fine, there is no "me" in this.

intractabilis commented 4 years ago

Btw, how to ask babel? Or rather include them into the discussion?

jspitz commented 4 years ago

In any case I think this should not be changed, for backwards compatibility reasons.

If given enough evidence, I am open to introducing an option to opt in to a different dash appearance.

kia999 commented 4 years ago

The Russian dash is 20% shorter. The first developers of the Russian localization of LaTeX specifically emphasized this. Look at the Russian LH fonts in MetaFont format. I do not intend to change this setting to please dilettantes of various kinds.

Igor A. Kotelnikov

On Thu, 1 Oct 2020 at 14:47, intractabilis notifications@github.com wrote:

Btw, how to ask babel? Or rather include them into the discussion?


jspitz commented 4 years ago

Русское тире на 20% короче. Первые разработчики русификации LaTeX'а это специально подчеркивали. Посмотрите русские шрифты LH в формете MetaFont. Я не намерен менять эту установку в угоду разного рода дилетантов. Igor A. Kotelnikov чт, 1 окт. 2020 г. в 14:47, intractabilis notifications@github.com:

Which Google translates for me to:

The Russian dash is 20% shorter. The first developers of the Russification of LaTeX are specially emphasized. Look at Russian fonts LH in MetaFont format. I do not intend to change this attitude to please all kinds of amateurs.

jspitz commented 4 years ago

Please do not close. We close after the initial fix has been released.

intractabilis commented 4 years ago

The Russian dash is 20% shorter. The first developers of the Russian localization of LaTeX specifically emphasized this. Look at the Russian LH fonts in MetaFont format. I do not intend to change this setting to please dilettantes of various kinds.

Hello, Igor. Nice to meet someone familiar (I studied at NSU, by the way). In this case the dilettante is you. I have studied the GOSTs of 1957 and 1971 (the latter are still in force) and textbooks such as Shulmeister's classic. I have also studied the history of how the Russian typefaces standardized in the GOST were created, going back to the beginning of the last century. I have looked through ordinary literature issued by Soviet publishers in the pre-computer era. Nowhere is there any short dash. Everywhere the width of the dash equals the width of the letter М, except in narrow typefaces. In the narrow GOST typefaces the dash is wider and corresponds to the width of the letter М in a normal, non-narrow typeface. A dash narrower than М is a figment of imagination of the authors of the "Russification of LaTeX" and has nothing to do with Russian typography.

intractabilis commented 4 years ago

The Russian dash is 20% shorter. The first developers of the Russification of LaTeX are specially emphasized. Look at Russian fonts LH in MetaFont format. I do not intend to change this attitude to please all kinds of amateurs.

He is wrong and quite arrogant.

jspitz commented 4 years ago

Please do not make this an American-Presidency-TV-battle kind of discussion.

intractabilis commented 4 years ago

Please do not make this an American-Presidency-TV-battle kind of discussion.

I am trying. However, Igor bluntly called me an amateur. It's difficult to keep a civilized discussion when someone starts attacking a person instead of discussing the issue.

intractabilis commented 4 years ago

Please do not close. We close after the initial fix has been released.

Sorry, it was an accident. I was shocked by Igor's attack and clicked something in haste.

intractabilis commented 4 years ago

If given enough evidence, I am open to introducing an option to opt in to a different dash appearance.

It's rather that there is no evidence that a standard for a "Russian tiret" 20% shorter than the em dash ever existed: not in standardization documents, not in textbooks, not in actual books printed in the pre-computer era, not in historical notes.

jspitz commented 4 years ago

I do not understand your qualms. "--- is only available as a babel shorthand since there was no easy way to enter this 0.8em dash in TeX, and apparently users requested that.

If you prefer a 1em dash with a half space, you can simply use \,---. This is one key stroke more. A babel shorthand for this would be just redundant.

jspitz commented 4 years ago

BTW also the extdash package might be of interest for you.

kia999 commented 4 years ago

\textemdash, \textendash and \cyrdash all have different lengths, as shown by the attached example. None of the forums I visited have any idea about a dash that is shorter than an em dash but longer than an en dash. These are people who know nothing except Word.

\documentclass{article}

\usepackage[russian]{babel}
\usepackage{calc}
\begin{document}
\newlength{\len}
\Huge
\noindent
\textemdash\ \verb|\textemdash| = \setlength{\len}{\widthof{{\textemdash}}} \the\len\\
\cyrdash\ \verb|\cyrdash| = \setlength{\len}{\widthof{{\cyrdash}}} \the\len\\
\textendash\ \verb|\textendash| = \setlength{\len}{\widthof{{\textendash}}} \the\len
\end{document}

Igor A. Kotelnikov igor.kotelnikov@gmail.com

Budker Institute of Nuclear Physics SB RAS, Lavrentyev Av. 11, Novosibirsk, 630090, Russia

On Thu, 1 Oct 2020 at 18:10, Jürgen Spitzmüller notifications@github.com wrote:

I do not understand your qualms. "--- is only available as a babel shorthand since there was no easy way to enter this 0.8em dash in TeX, and apparently users requested that.

If you prefer a 1em dash with a half space, you can simply use ---\,. This is one key stroke more. A babel shorthand for this would be just redundant.


jspitz commented 4 years ago

To expand a bit on @kia999's answer: The traditional Cyrillic encodings (T2A, T2B, T2C, and X2) actually contain a real 0.8em dash glyph (at pos. 22) which is mapped to \cyrdash. See: http://mirrors.ctan.org/macros/latex/required/cyrillic/cyoutenc.pdf With these encodings, even \textemdash (and thus ---) is redefined to output this shorter dash, so (unfortunately IMHO) these commands produce different results depending on the encoding.

Now AFAIU the original aim of the babel shorthands we discuss here is to provide (1.) quick access to this glyph (with different usage variants) and (2.) provide a faked form of the glyph also for other encodings than the ones mentioned (via a generally available \cyrdash macro).

With polyglossia, the situation is somewhat different since we always use the TU font encoding (Unicode) and there is no 0.8em dash in Unicode. Thus the character is always faked (as opposed to 8-bit LaTeX). (Note that there is also a \textthreequartersemdash [0.75em, obviously] provided via textcomp, which doesn't have an equivalent in Unicode either.)
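To see the difference on a given font, the widths can be measured directly. This is a sketch assuming XeLaTeX or LuaLaTeX with the default fontspec font; the faked dash below reuses the \hbox construction quoted earlier in the thread rather than polyglossia's own \cyrdash, and \dashwidth is a name chosen here for illustration:

```latex
\documentclass{article}
\usepackage{fontspec}
\begin{document}
\newlength{\dashwidth}

% Width of the font's own em dash (U+2014)
\settowidth{\dashwidth}{\textemdash}
em dash: \the\dashwidth\par

% Width of the faked dash: two en dashes overlapped in a 0.8em box
\settowidth{\dashwidth}{\hbox to.8em{\textendash\hss\textendash}}
faked 0.8em dash: \the\dashwidth\par

% Width of the font's own en dash (U+2013)
\settowidth{\dashwidth}{\textendash}
en dash: \the\dashwidth
\end{document}
```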

I won't interfere in the discussion of whether this dash is a LaTeX artifact or represents some convention of Russian typography. The point of babel shorthands, though, is to provide some sort of convenience function for people coming from babel. And I think they can rightly expect that the semantics of these shortcuts is the same as in babel.

intractabilis commented 4 years ago

I do not understand your qualms. "--- is only available as a babel shorthand since there was no easy way to enter this 0.8em dash in TeX, and apparently users requested that.

I have no qualms. My idea was that a babel shorthand could be used to follow the Russian typesetting rules automatically, if it were changed to represent the real Russian punctuation. So far it is just the preference of a small fringe group. A "20% shorter than em dash" punctuation mark exists neither in the Russian typographic tradition nor in any existing or past Russian typesetting standard.

intractabilis commented 4 years ago

All forums that I visited have no idea about dash which is shorter than emdash but longer than endash.

It's because it doesn't exist in the Russian typographic tradition or in any Russian typographic standard.

These are people knowing nothing except for Word.

I don't appreciate your continuing trend of launching personal attacks on other people.

intractabilis commented 4 years ago

The point of babel shorthands, though, is to provide some sort of convenience function for people coming from babel. And I think they can rightly expect that the semantics of these shortcuts is the same as in babel.

It's a fair point. Though it's difficult to say what the majority expects. They may expect, for example, that it works according to the typesetting rules, which have no "20% shorter than em dash" punctuation. In Russian typesetting, at least throughout the 20th century, the corresponding punctuation symbol was always equal in width to the letter М.

I started this discussion actually to probe what people expect from "---.

jspitz commented 4 years ago

You cannot probe what people expect (without doing a representative survey). But even if you did, this does not falsify the backwards compatibility argument.

intractabilis commented 4 years ago

You cannot probe what people expect (without doing a representative survey).

Nothing's ever easy. But I had to start somewhere, in the hope that once people are aware of the issue of a non-existent punctuation symbol, they would help turn the discussion into something constructive.

this does not falsify the backwards compatibility argument.

That's true. However, we have to ask ourselves why a poor decision by a fringe group of people should result in popularizing a punctuation mark that exists neither in the typographic tradition nor in any standard. I don't feel the backwards compatibility argument is fair in this case.

kia999 commented 4 years ago

The Russian LH fonts and the first versions of the Russian package for babel were made by Olga Lapko and Irina Makhovaya. They worked in scientific publishing houses and knew what they were doing. They are the ones to ask why the \cyrdash command and the corresponding dash were added. Here is the address of Olga Lapko's page:

https://www.researchgate.net/scientific-contributions/Olga-Lapko-31801700

Igor A. Kotelnikov

On Thu, 1 Oct 2020 at 20:57, intractabilis notifications@github.com wrote:

You cannot probe what people expect (without doing a representative survey).

Nothing's ever easy. But I had to start somewhere, in the hope that once people are aware of the issue of a non-existent punctuation symbol, they would help turn the discussion into something constructive.

this does not falsify the backwards compatibility argument.

That's true. However, we have to ask ourselves why a poor decision by a fringe group of people should result in popularizing a punctuation mark that exists neither in the typographic tradition nor in any standard. I don't feel the backwards compatibility argument is fair in this case.


intractabilis commented 4 years ago

The Russian LH fonts and the first versions of the Russian package for babel were made by Olga Lapko and Irina Makhovaya. They worked in scientific publishing houses and knew what they were doing. They are the ones to ask why the \cyrdash command and the corresponding dash were added. Here is the address of Olga Lapko's page: https://www.researchgate.net/scientific-contributions/Olga-Lapko-31801700

Thank you, I will read the articles and try to contact them to clarify the history of the question. I will report the results. :)

intractabilis commented 4 years ago

Jürgen, can we meanwhile leave the following comment in the polyglossia documentation next to "---:

This shorthand produces a dash that is 20% shorter than the em dash of the current font. Note that this length was chosen for backward compatibility with babel and aims to imitate a specific glyph present in the first Russian localization of LaTeX. Check your typesetting requirements carefully before using it, because traditionally Russian typography uses a regular-length em dash to typeset the “Russian tiret”.

LSinev commented 4 years ago

traditionally Russian typography uses a regular length emdash for typesetting of a “Russian tiret”.

This statement should cite some source(s). So should the opposite statement. It is safer and more neutral to simply put a full stop after "using it".

ivankokan commented 4 years ago

Btw, how to ask babel? Or rather include them into the discussion?

https://github.com/latex3/babel https://github.com/kia999/babel-russian

intractabilis commented 4 years ago

This statement should cite some source(s). So should the opposite statement. It is safer and more neutral to simply put a full stop after "using it".

Instead of following Aristotle's mistake of trusting someone's quotes, I suggest catching flies and counting their legs. Gosh, I am shocked by how much backlash I am getting from the Russian community. Guys, I don't need it. I am trying to convince you to open your eyes and see what is in front of them: there is no 0.8em dash in Russian books. You don't like how I am suggesting to improve the documentation? Suggest something better. Or let's eff it and forget it. I am giving up.

Альбер Камю. Бунтующий человек. Москва. Политиздат. 1990: (image)

Б.А. Дубровин, С.П. Новиков, А.Т. Фоменко. Современная геометрия. Москва «Наука». 1979: (image)

В.И. Смирнов. Курс высшей математики. Том 4. Москва «Главиздат». 1953: (image)

Г. Джэффрис, Б. Свирлс. Методы математической физики. Москва «Мир». 1969: (image)

Г.И. Пухальский, Т.Я. Новосельцева. Проектирование дискретных устройств на интегральных микросхемах. Москва «Радио и связь». 1990: (image)

Ганс Христиан Андерсен. Сказки и истории. Том I. Ленинград «Художественная литература». 1969: (image)

И.Г. Араманович, В.И. Левин. Уравнения математической физики. Москва «Наука». 1969: (image)

И.П. Степаненко. Основы теории транзисторов и транзисторных схем. Москва «Энергия». 1967: (image)

М.А. Красносельский. Приближённое решение операторных уравнений. Москва «Наука». 1969: (image)

М.М. Вайнберг, В.А. Треногин. Теория ветвлений решений нелинейных уравнений. Москва «Наука». 1969: (image)

Наука и жизнь. No 4. 2009: (image)

Общая биология. Учебник для 9-10 классов средней школы. Москва «Просвещение». 1988: (image)

Ф.М. Достоевский. Собрание сочинений в 6-ти томах. Том 6. Идиот. Ленинград «Наука». 1989: (image)

Ю. Рытхэу. Время таяния снегов. Москва «Молодая гвардия». 1981: (image)

intractabilis commented 4 years ago

Still convinced that the Russian tiret is 20% shorter than emdash? Or would you like to participate in doing something about this? Those are random picks from my bookshelf.

LSinev commented 4 years ago

AFAIK, Russian font design has its own problems and solutions regarding spacing for the purposes of reading and comprehension. The books describing it may exist only in libraries, not scanned (let alone OCRed). Maybe there is no M-rule in Russian font design but something like a Ш-rule. Adding to the problem, for many years the equipment of printing houses was mostly of French or German origin (as European technology leaders, or as a result of war reparations), so the absolute number of books printed with some sort of Russian width rules was small (changing the letters was an actual need; as for changing the signs, "the existing ones are good enough"). And by the time font encodings for computers were created (or were spreading), English-speaking programmers were the first to implement their own samples and rules (so we had, or still have, problems with 7-bit and 8-bit encodings in LaTeX). That's why there are packages with German and French typesetting rules for LaTeX.

Maybe a \cyrdash of a specific length was even the way to establish proper Russian typesetting in the information era (where changing a letter in a font costs the same as changing any other glyph). It is just easier to wait for comments from https://github.com/reutenauer/polyglossia/issues/445#issuecomment-702277574 or to search for USENET discussions with them from the 1990s (if logs can be found).

Offtop (no info about tiret width or length): https://vk.com/doc-21205379_152396071?hash=1aaafb52c3cc70357a — about font design for posters from 1977 (never thought of so many things have to be considered for fonts and posters) https://vk.com/wall-53068949_1670 — book of 1922, on typesetting (!) where some sort of graphical translit was used to print cyrillic of that time with latin letters.

ivankokan commented 4 years ago

Maybe there is no M-rule for Russian font design but something like Ш-rule.

I remember seeing this concept somewhere... enumitem (https://www.ctan.org/pkg/enumitem)!

8 More about counters

\AddEnumerateCounter*{\asbuk}{\c@asbuk}{7}

intractabilis commented 4 years ago

I am giving up and not going to waste my time anymore. I will let the community live in the bubble of its beliefs.

ivankokan commented 4 years ago

I am giving up and not going to waste my time anymore. I will let the community live in the bubble of its beliefs.

I did not want to "choose side" with my last comment, I just remembered seeing something related to the width of Ш somewhere else.

kia999 commented 4 years ago

Please re-initiate the discussion of the various dashes (tirets) on https://tex.stackexchange.com/ with the tag cyrillic.

Igor A. Kotelnikov

On Tue, 6 Oct 2020 at 14:45, Ivan Kokan notifications@github.com wrote:

I am giving up and not going to waste my time anymore. I will let the community live in the bubble of its beliefs.

I did not want to "choose side" with my last comment, I just remembered seeing something related to the width of Ш somewhere else.


kia999 commented 4 years ago

It is claimed here that Unicode has three different dashes, but it does not say which ones:

http://newforum.gramota.ru/viewtopic.php?f=15&t=6032

Igor A. Kotelnikov

On Tue, 6 Oct 2020 at 20:52, Igor A. Kotelnikov igor.kotelnikov@gmail.com wrote:

Please re-initiate the discussion of the various dashes (tirets) on https://tex.stackexchange.com/ with the tag cyrillic.

Igor A. Kotelnikov


intractabilis commented 4 years ago

I did not want to "choose side" with my last comment, I just remembered seeing something related to the width of Ш somewhere else.

There were no such rules in classic Soviet typography, where fonts were selected from the stock left by commercial manufacturers after the Revolution. There couldn't be such rules, because it wasn't the computer era, where you can play with arbitrary lengths. And there were no font designers in the Soviet Union at the beginning of the 20th century. The length of the tiret was simply what it is in the fonts standardized in GOST, period. It just so happened that in the GOST fonts the length of the tiret is equal to М (in a narrow font, equal to the wide М). If it is equal to Ш too, fine. It's just not related to the problem at hand, because a 20% shorter dash is not used in traditional Russian typography, regardless of whether what is really used equals М or Ш in length. So discussing it is not constructive.

Regardless of any sides, the documentation must be updated. I gave you a very well-balanced phrase that gives the user the right information and a fair warning without taking any sides. You don't like it? Come up with your own, but this must be done.

This was a concrete, constructive and practical step. Since even this simple thing failed to materialize, I stopped investing my time.

kia999 commented 4 years ago

Legacy engines such as latex or pdflatex use 8-bit encodings such as OT1 and T1, and \cyrdash is defined by Cyrillic encodings such as T2A, T2B, T2C, etc. Babel with the option russian, ukrainian, etc. automatically loads the T2A encoding by default (or another appropriate encoding, namely the one loaded last by the fontenc package called before babel) and redefines the ligature --- through \cyrdash. The T2* encodings take \cyrdash and \textemdash from the same code point 22 of the LH font if Russian is selected; their width is then 8pt (for a 10pt document). If English is the current language, a Latin encoding is usually loaded, in which \cyrdash is undefined in the CM fonts (it raises an error), and \textemdash is taken from the CM fonts, where its length is 10pt, i.e. longer than when Russian is selected.

In the case of xelatex, the \cyrdash macro is not defined in the Unicode encoding TU, so babel fakes it as \hbox to 0.8em{--\hss--}, which turns out to be approximately 80 percent of \textemdash.
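For comparison with the polyglossia MWE at the top, a minimal pdfLaTeX document exercising the 8-bit setup described here might look as follows. It is a sketch that assumes a UTF-8 source file and installed T2A fonts; the sample sentence is borrowed from the MWE:

```latex
% Compile with pdflatex; the source file is assumed to be UTF-8 encoded.
\documentclass{article}
\usepackage[T2A]{fontenc}    % Cyrillic 8-bit encoding discussed above
\usepackage[russian]{babel}
\begin{document}
% Babel shorthand for the Cyrillic dash with its surrounding spaces:
Облонский "--- Стива, как его звали в свете.

% Plain ligature and explicit command, for comparison:
Облонский~--- Стива, Облонский~\cyrdash{} Стива.
\end{document}
```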

Ulrike Fischer directed me to a valuable list of all dashes available in Unicode (e.g. http://jkorpela.fi/dashes.html). I did not find a reasonable candidate for \cyrdash there, although from my own experience of publishing books I know that editors at Russian publishing houses are sometimes very strict about the size of dashes. I would also agree that Microsoft Word automatically produces correct dashes between Russian words, but I am not sure (and cannot check right now) whether it produces longer dashes between English words.

Igor A. Kotelnikov igor.kotelnikov@gmail.com

On Wed, 7 Oct 2020 at 01:48, intractabilis notifications@github.com wrote:

I did not want to "choose side" with my last comment, I just remembered seeing something related to the width of Ш somewhere else.

There were no such rules in classic Soviet typography, where fonts were selected from the stock left by commercial manufacturers after the Revolution. There couldn't be such rules, because it wasn't the computer era, where you can play with arbitrary lengths. And there were no font designers in the Soviet Union at the beginning of the 20th century. The length of the tiret was simply what it is in the fonts standardized in GOST, period. It just so happened that in the GOST fonts the length of the tiret is equal to М (in a narrow font, equal to the wide М). If it is equal to Ш too, fine. It's just not related to the problem at hand, because a 20% shorter dash is not used in traditional Russian typography, regardless of whether what is really used equals М or Ш in length. So discussing it is not constructive.

Regardless of any sides, the documentation must be updated. I gave you a very well-balanced phrase that gives the user the right information and a fair warning without taking any sides. You don't like it? Come up with your own, but this must be done.

This was a concrete, constructive and practical step. Since even this simple thing failed to materialize, I stopped investing my time.


intractabilis commented 4 years ago

Word uses the U+2014 em dash (equivalent to \textemdash) regardless of the language.