avian2 / unidecode

ASCII transliterations of Unicode text - GitHub mirror
https://pypi.python.org/pypi/Unidecode
GNU General Public License v2.0

Add support for chess pieces and playing card suits #93

Closed penguinland closed 7 months ago

penguinland commented 7 months ago

These might be more verbose than you'd prefer; I'm open to renaming them if you want.

Thanks for maintaining Unidecode! I'm a slow reader but a fast listener, and I make frequent use of a program that takes whatever's in my clipboard, sends it through Unidecode, and sends the results to a text-to-speech engine whose input must be ASCII. It works great, though recently I've been going through articles about a card game, and it's been hard to use (hence the PR).

avian2 commented 7 months ago

Thanks for your contribution. I'm happy you find Unidecode useful.

I would add black/white to suit names so that black and white symbols can be differentiated. If that works for you I can merge your change.

penguinland commented 7 months ago

Could you give a use case for when someone would want to differentiate them? I thought the purpose of having two different versions of the red suits in Unicode was to render them better in both monochrome (♣♢♡♠) and color (♣♦♥♠) (GitHub won't let me use color, but imagine the red suits were red, like this example). It would then be strange to have variations of half the suits but not the others, so ♧ and ♤ were included for symmetry. This is why ♠♡♢♣ is one contiguous run of code points and ♤♥♦♧ is another: the first four are for monochrome text, and the next four are their alternatives.
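For reference, the two runs of code points described above can be listed with Python's standard `unicodedata` module. This is only a sketch for inspecting the characters; the helper name `suit_table` is made up for illustration and is not part of Unidecode.

```python
import unicodedata


def suit_table():
    """Return (code point, character, Unicode name) for U+2660..U+2667."""
    rows = []
    for cp in range(0x2660, 0x2668):
        ch = chr(cp)
        rows.append((cp, ch, unicodedata.name(ch)))
    return rows


if __name__ == "__main__":
    for cp, ch, name in suit_table():
        # The official names carry the BLACK/WHITE distinction under discussion.
        print(f"U+{cp:04X} {ch} {name}")
```

The first four entries are ♠♡♢♣ and the last four are ♤♥♦♧, and each official name starts with either BLACK or WHITE.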

I'm pushing back on this because adding the words "black" and "white" to the texts I'm examining would only add noise to the output. I envisioned something similar to the way that U+24B6 (Ⓐ), U+1D538 (𝔸), and U+1D49C (𝒜) all become A via Unidecode.
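The idea of collapsing variants can be illustrated with the standard library alone. The sketch below is not how Unidecode works internally (it uses lookup tables), and both helper names, `fold_to_ascii` and `suit_word`, are hypothetical. The suit words it produces are illustrative and are not necessarily the strings used in the pull request.

```python
import unicodedata


def fold_to_ascii(text):
    """Collapse compatibility variants (circled, double-struck, script) to ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop anything that still is not ASCII after decomposition.
    return decomposed.encode("ascii", "ignore").decode("ascii")


def suit_word(ch):
    """Hypothetical helper: name a suit symbol without the black/white qualifier."""
    name = unicodedata.name(ch)  # e.g. "WHITE HEART SUIT"
    words = [w for w in name.split() if w not in ("BLACK", "WHITE", "SUIT")]
    return " ".join(words).lower()


if __name__ == "__main__":
    print(fold_to_ascii("\u24B6\U0001D538\U0001D49C"))  # circled, double-struck, script A
    print(suit_word("\u2661"), suit_word("\u2665"))      # white and black heart
```

Under this scheme both heart symbols yield the same word, in the same way that the three variants of A fold to a plain A.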

avian2 commented 7 months ago

I don't have a use case. My thinking was that since Unicode makes a distinction between white and black symbols it might be useful to someone. You're right that, at least for the card games I'm aware of, the suits are red and black. White and black would be confusing.

I tried looking it up in the Unicode standard, since it sometimes includes a rationale. The card suit symbols have been included since Unicode 1.0. Unfortunately, the text only says that the symbols are included for typesetting card game manuals (Unicode 1.0, page 94) and that there was some attempt at giving the codes a meaningful order.

I'm probably overthinking this. I'm merging your original pull request. I think it is an improvement over the current state, and if someone later finds a use case, they can suggest another change.

Thanks again.